ABSTRACT

The purpose of this thesis is to begin the evaluation of the Transputer, a 32 bit
microprocessor on a chip, and to verify its potential and limitations for real time
applications in distributed systems.

The evaluation concentrates on the four physical communication links and their
advertised capability to operate in parallel with the main processor (CPU), each one of
them at a rate of 10 mbits/sec in each direction. The thesis also gives the reader an
introduction to the machine itself and to the Occam Programming Language, describes
the environment at the Naval Postgraduate School (NPS), and suggests a learning
sequence for the novice.

The  evaluation  programs  and  other  example  programs  presented  in  this  thesis 
were  implemented  using  the  Occam  Programming  Language  (Proto-Occam)  in  either 
the  Occam  Programming  System  (OPS)  or  the  Transputer  Development  System  (TDS), 
both resident on the VAX 11/780 computer under the VMS Operating System
(VAX/VMS).

I. INTRODUCTION

A.       BACKGROUND 

The NPS AEGIS project has among its primary goals the research and development
of alternative architectures for the AEGIS Combat Weapon System (CWS), focusing
on low cost, reliable and fault tolerant architectures. As the cost of micro-processors
keeps falling while their capabilities keep increasing, it becomes very attractive to
consider using these cheap and powerful tools to accomplish the functions of any
system.

One branch of this research is based upon the Intel 86/12A Single Board
Computers working under the MCORTEX operating system [Ref 1], fully
developed at the AEGIS lab. It exploits the 10 mbits/second capacity of the Intel
MULTIBUS and uses the concept of shared memory to allow multiprocessors, arranged
in clusters of up to eight single board computers, to increase the throughput of the
system. Each cluster has its own shared memory, whose access is controlled by means
of eventcounts. The clusters intercommunicate through an Ethernet link [Ref 2].

One alternative concept for distributed systems is the use of message passing
[Ref 2,3]. The Transputer concept exploits this idea and produces a very interesting
and flexible way of designing multiprocessor systems. This second branch of research is
now in its third released work¹ and is increasing in importance and extent.

This thesis was developed concurrently with the one by Cordeiro, M. M. [Ref 6].
Since these theses were in fact the first to really program this new machine at the
NPS, a series of obstacles were encountered and overcome one by one, up to the point
where we were able to divide the work and search on our own for the answers we were
individually seeking. This is why we have tried to point out many of the pitfalls that
one may encounter in future research in this area using the system available at the
NPS.


¹ See Evin, B., Implementation of a Serial Delay Insertion Type Loop
Communication for a Real Time Multi-Transputer System [Ref 4] and Selcuk, Z.,
Implementation of a Serial Communication Process for a Fault Tolerant, Real Time,
Multi-Transputer Operating System [Ref 5].

1. Intended Audience

This thesis should be a good first reading for the person beginning to work with
transputers or Occam. Appendix A presents a Learning Sequence; Appendix B presents
an OPS Tutorial; Appendix C presents a TDS Tutorial; and Appendix D provides some
hints on how to program in Occam. It should also be a good reference for transputer
users and real-time system designers and implementors regarding what they can
expect in terms of performance from the INMOS links. They will be able to use the
Tables, Graphics and the Evaluation Programs to check and confirm their expectations
concerning the right construct or the right message size to use in order to achieve
the desired throughput or communication rate.

B.       WHY  THE  TRANSPUTER 

The  TRANSPUTER  is  a  member  of  a  family  of  micro-processors,  that  have  on 
one  chip,  the  processor,  its  own  local  memory  and  links  for  point  to  point  connections 
to  other  transputers. 

Each  transputer  product  contains  special  circuitry  and  interfaces  adapting  it  to 
each  particular  use.  For  example  a  peripheral  control  transputer,  such  as  a  graphics  or 
disk  controller,  has  interfaces  tailored  to  the  requirements  of  a  specific  device  [Ref  7]. 

The transputers were designed in parallel with the Occam programming language
and were first released in 1985. Now, two years later, there is a growing variety of
transputers available on the market with different capabilities and for different
applications. Some of these are listed in Appendix G.

The T-414 is a 32 bit micro-processor with 2 kbytes of on chip RAM, four
standard INMOS serial links, an external memory interface and peripheral interfacing
on a single 1.5 micron CMOS chip. As an example, its characteristics and technical
data are summarized in Table 1, and its internal architecture is depicted in Figure 1.1.

For  the  sake  of  comparison,  Table  2  lists  the  processor  cycle  time  or  internal 
clock  of  other  commercially  available  computers  and  also  some  processors  used  in 
military  applications  for  real-time. 

The internal architecture of the transputer follows von Neumann principles and
permits the processor itself to run at the same time as the 4 links operate. In this way
a high level of parallelism is already achieved at the chip level.


² Reproduced by permission of INMOS Corporation.


TABLE 1
TRANSPUTER T-414 TECHNICAL DATA AND CHARACTERISTICS

               processor      internal      instruction
               cycle time     clock         throughput
(T 414-20)     50 nsec        20 mhz        10 MIPS
(T 414-15)     67 nsec        15 mhz        7.5 MIPS
(T 414-12)     80 nsec        12.5 mhz      6.0 MIPS

external clock cycle ........ 5 mhz
time slice .................. 1 msec (approximately)
internal bus speed .......... 80 mega bytes/second
internal (on chip) memory ... 2 kilo bytes
internal memory cycle ....... 50 nsec (for 50 nsec cpu)
external memory interface ... 25 mega bytes/second bus
external memory cycle ....... 150 nsec
address capability .......... 4 giga bytes (32 bit address)
links (serial) .............. 4 (full duplex, DMA)
link bit rate ............... 10 mbits/sec (20 mbits/sec)
link net bit rate (Obs. 1) .. 3.8 mbits/sec (6.1 mbits/sec)
power dissipation ........... less than 500 milliwatts
physical dimensions ......... 45 mm square chip (84 pins)

Obs. 1: These values refer to the bit rates listed immediately above,
        respectively, and are fully explained in Chapter III.
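
The first three rows of Table 1 are related by simple arithmetic: the processor cycle time is the reciprocal of the internal clock. The short Python sketch below is only an illustration of that relationship; the function name is ours and does not come from any INMOS document.

```python
def cycle_time_ns(clock_mhz):
    """Return the processor cycle time in nanoseconds for a clock given in MHz."""
    # 1 / (clock_mhz * 1e6 Hz) seconds, expressed in nanoseconds
    return 1000.0 / clock_mhz


# Clock values taken from Table 1
for part, clock_mhz in [("T 414-20", 20.0), ("T 414-15", 15.0), ("T 414-12", 12.5)]:
    print(f"{part}: {cycle_time_ns(clock_mhz):.0f} nsec")
```

The rounded results agree with the cycle times listed in the table.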


When reading transputer related material, one may find references to the T-424. This
was a prototype that is no longer on the market.

The systems architecture is simplified by using the transputer links for point to
point communications, which allows the available transputers to be configured in any
desired topology matching the programmer's needs [Ref 4,8]. Point to point
communication links have many advantages over multiprocessor buses [Ref 7]:

• There is no contention for the communication mechanism, regardless of the
number of transputers in the system (which is not the case in shared memory
systems) [Ref 9].

• There is no capacitive load penalty as transputers are added to a system
(especially if they will work independently).

TABLE 2
PROCESSOR CYCLE TIME/CLOCK EXAMPLES

SBC 86/12A ............. (1 to 8 mhz)
80286 .................. (6 to 12 mhz)
80386 .................. (16 mhz)
Transputer T 414-20 .... 50 nsec (20 mhz)
AN/UYK 7 ............... 750 nsec
AN/UYK 43 .............. 150 nsec
VAX 11/780 ............. 200 nsec
IBM 3033 ............... 57 nsec
IBM 3081 (k) ........... 26 nsec

However, as the number of transputers in the system increases, a message routing
system is needed in order to permit indirectly interconnected transputers to
communicate with each other. This creates some overhead for the system.
Cordeiro [Ref 6] addresses this point, presenting a design and implementation for such
a system.

It is up to the programmer to decide which process should be placed in which
processor. For efficiency, it is recommended to place frequently
communicating processes in adjacent transputers (directly connected by a link).

It is still possible, however, to adapt previously designed systems to this new
architecture and develop a systems architecture where a central data base would be
managed by a central transputer, which would address a large memory that could be
read or written by the processors connected to the four links, as depicted in Figure 1.2.
This would involve further study, however, and is not within the scope of this thesis.

Another point worth mentioning is that, although this work was developed using
Occam, a C³ compiler is already available, and Pascal and Fortran compilers are
coming soon. The Ada compiler for this machine is under development and,
according to INMOS representatives, will be released before the end of 1987.

In this way the final goal of the AEGIS project, which is to research alternative ways
of implementing the AEGIS system, will have the DoD language available on the
present machine.


³ The C programming language compiler generates code for the transputer

Figure 1.1 Block Diagram of Transputer Architecture.

It is also pertinent to mention at this point that at the last Occam User Group
meeting, which took place in Santa Clara, CA, on March 10th, 1987, there were
representatives of IBM, Tektronix and other major corporations showing the
participants work developed, or under development, by them using the Transputer.

C.   THE  OCCAM  PROGRAMMING  LANGUAGE 

Occam is a programming language that, since its first release in 1983, has been known
as very suitable for the description of multiple processor systems [Ref 10], due to the
simplicity with which concurrency and parallelism can be addressed [Ref 11].

Since then the language has been modified and its capabilities enhanced, and one of
the latest versions, known as Occam 2, is described in the book by Pountain [Ref 12].
However, this thesis was developed using one of the primitive versions of the language,
called Proto-Occam⁴, that is best described in the Occam Programming Manual
[Ref 13: section 3], with slight modifications introduced by the OPS/TDS compiler
implementations, described in detail in the Occam Implementation section of the
OPS Manual [Ref 13: section 4].

Figure 1.2 System Using a Transputer as Memory.

The goal of this section is to address briefly the primitive processes and
constructs used in Occam (Proto-Occam), calling attention to the limitations and
capabilities of this version of the language. Appendix D presents some hints for
programming the transputer T-414 using Proto-Occam.

1.   Primitives 

a.    The  Channel 

The channel (CHAN) is an identifier used for performing communications
between concurrent processes (if in the same processor) or processes executed in
parallel (if in different processors). We can think of a channel as a pipe that
connects two processes that are being executed concurrently or in parallel.
If the processes are in the same processor (same transputer), this is done through a
specified memory location determined at compile time, as if it were a global variable;
but if the communicating processes are in different transputers, the channel uses the
physical links connecting the transputers. Any type of variable may flow through the
channel, but the programmer must ensure that the type being transmitted is the same
as the one expected at the receiver, or the compiler will flag an error.

⁴ Proto-Occam is so called in the Occam Programming Manual, but it is sometimes
also referred to as Occam 1.

This is the basis for the input and output primitives:

•  chanin  ?  char  -  This  can  be  read  as  the  variable  "char"  will  receive  a  value  that 
is  coming  from  elsewhere  through  the  channel  chanin. 

•  chanout  !  5  -  This  can  be  read  as  the  constant  "5"  is  being  output  to  another 
process  through  the  channel  chanout. 

This implies that somewhere in our transputer network there will be a
process that is transmitting some value through the channel "chanin", and another (or
maybe the same) process that is receiving into some variable the value "5" through a
channel called chanout.
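
The input and output primitives can be imitated in a conventional language. The following Python sketch is not Occam and is not taken from the INMOS manuals; it only illustrates the idea of a point to point channel on which the sender waits until the receiver has taken the value. The class name `Channel` and the method names `output` and `input` are our own, and the sketch assumes exactly one sender and one receiver per channel.

```python
import queue
import threading


class Channel:
    """Illustrative point to point channel for one sender and one receiver."""

    def __init__(self):
        self._slot = queue.Queue(maxsize=1)

    def output(self, value):
        """Analogue of `chan ! value`: returns only after the value was taken."""
        self._slot.put(value)
        self._slot.join()          # wait until the receiver calls task_done()

    def input(self):
        """Analogue of `chan ? var`: blocks until a value arrives."""
        value = self._slot.get()
        self._slot.task_done()     # releases the waiting sender
        return value


chan = Channel()
received = []


def producer():
    chan.output(5)                 # like: chanout ! 5


def consumer():
    received.append(chan.input())  # like: chanin ? char


threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(received)
```

Whichever thread arrives first waits for the other, which is the behaviour the text describes for two processes joined by a channel.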

2.  Constructs 

Occam has six basic constructs:

a the sequential (SEQ) construct
b the parallel (PAR) construct
c the alternative (ALT) construct
d the conditional (IF) construct
e the repetitive (WHILE) construct
f the replicator (FOR) construct.

The sequential, conditional and repetitive constructs have the same usage as in
many other structured languages.⁵ It is interesting to note that Occam needs an
explicit SEQ construct, whereas in such languages sequential execution is normally
the only way to execute a program and so requires no keyword.

a.    The  PAR  Construct 

A parallel construct causes its component processes to be executed in
parallel, if the component processes reside on different transputers, or concurrently in a
time shared fashion, if they reside on the same processor [Ref 13: section 3, item 3.4.2].

Note from Figure 1.3 that:

• Process one and process two are different processes.

• Occam is fixed format and indentations are always 2 spaces for nesting.

• There are no begins or ends to delimit processes.

• We can declare variables anywhere in the code as long as it is before the
beginning of the process that will refer to them.

• Three dashes (---) indicate that comments follow them.

    CHAN c1, comms, c2 :        --- channel declarations
    PAR
      WHILE TRUE                --- process one
        VAR x :
        SEQ
          c1 ? x
          comms ! x
                                --- end process one
      WHILE TRUE                --- process two
        VAR y :
        SEQ
          comms ? y
          c2 ! y
                                --- end process two

Figure 1.3 Example of a PAR Construct.

⁵ Like Pascal, Ada or C programming languages.

(1) The PRI PAR Construct. The priority parallel construct, a variation
of the PAR construct, permits at most two processes under it. The first one is
given priority 0 (high), and the second one is given priority 1 (low). This maps
exactly onto the two priority levels that the chip supports. As the Reference Manual
[Ref 7: p. 3] says, the high priority process is expected to execute for only a short
period of time because, once it begins, it cannot be preempted.
b.    The  ALT  Construct 

An alternative construct is used to accept the first message available from a
number of input channels [Ref 13: section 3, item 3.4.3]. See Figure 1.4.


    CHAN c1, c2 :
    WHILE TRUE
      VAR x :
      ALT
        c1 ? x
          c3 ! x
        c2 ? x
          c3 ! x

Figure 1.4 Example of an ALT Construct.

Note from Figure 1.4 that:

• We could have any number of channels under the ALT, all of them
outputting to c3. This construct provides mutual exclusion⁶ in two lines
of code.

• All variable declarations are separated by commas and terminated by a colon.

There is also a variation of the ALT construct named PRI ALT, which
gives the first option of the ALT precedence over the others.
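
As a rough illustration of what the ALT in Figure 1.4 does, the Python sketch below forwards whatever arrives on either of two input queues to a single output queue. It is an analogue under stated assumptions, not Occam semantics: the helper `alt` is hypothetical, it polls the channels instead of waiting on them as the transputer does, and the queues are buffered whereas Occam channels are not. Because the helper scans the channels in list order, it actually behaves like PRI ALT.

```python
import queue
import time


def alt(channels, poll_interval=0.001):
    """Return (index, value) from the first channel found holding a message."""
    while True:
        for index, chan in enumerate(channels):
            try:
                return index, chan.get_nowait()
            except queue.Empty:
                continue
        time.sleep(poll_interval)   # nothing ready yet, try again shortly


c1, c2, c3 = queue.Queue(), queue.Queue(), queue.Queue()
c2.put("from c2")
c1.put("from c1")

# Two passes of the loop body of Figure 1.4: take one input, forward it to c3.
for _ in range(2):
    _, x = alt([c1, c2])
    c3.put(x)

forwarded = [c3.get(), c3.get()]
print(forwarded)
```

Only one message at a time reaches c3, whichever input it came from, which is the mutual exclusion property noted above.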
c.    Replicators 

A replicator may be used with a SEQ, PAR, ALT or IF construct to
replicate the process a number of times [Ref 13: section 3, item 3.4.6]:

• SEQ - When used with SEQ it provides a conventional loop.

• PAR - When used with PAR it creates an array of concurrent processes. See
Figure 1.5.

• ALT - When used with ALT it allows one input at a time to be received
from an array of channels. See Figure 1.6.


    CHAN c[n+1] :
    PAR i = [0 FOR n]
      WHILE TRUE
        VAR x :
        SEQ
          c[i] ? x
          c[i+1] ! x

Figure 1.5 Replicated PAR.
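
The pipeline of Figure 1.5 can be mimicked with threads and queues. The Python sketch below is only an analogue: the queues are buffered, unlike Occam channels, and the `None` end-of-stream marker is our own addition so that the example terminates (the Occam processes in the figure run forever).

```python
import queue
import threading

n = 4
c = [queue.Queue() for _ in range(n + 1)]    # like: CHAN c[n+1] :


def stage(i):
    """Body of one replicated process: forward values from c[i] to c[i+1]."""
    while True:
        x = c[i].get()       # like: c[i] ? x
        c[i + 1].put(x)      # like: c[i+1] ! x
        if x is None:        # hypothetical end-of-stream marker
            return


# like: PAR i = [0 FOR n]
workers = [threading.Thread(target=stage, args=(i,)) for i in range(n)]
for w in workers:
    w.start()

for value in (1, 2, 3, None):
    c[0].put(value)

results = list(iter(c[n].get, None))         # read until the marker arrives
for w in workers:
    w.join()
print(results)
```

Each value passes through all n stages in order, so the values leave the last channel in the order in which they entered the first.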

3.  Good  Features  of  Proto-Occam 

Proto-Occam has some nice features, such as:

• the facility in handling time for performance evaluation (TIME ? var)

• the use of time delay (TIME ? AFTER sometime) for real-time applications

• the SKIP, which has numerous applications and helps to handle exceptions

• we can access the byte in memory

• there is no need to declare count variables used in replicators

• we can have procedures with formal parameters being arrays of variable size;
this way the actual parameters may be of different sizes in different procedure
calls.

Figure 1.6 Replicated ALT.

⁶ Mutual exclusion is one of the critical issues in Operating System design [Ref 3]
and it is neatly handled by the ALT construct.

4.  Proto-Occam  Limitations 

Many of the limitations of Proto-Occam have been fixed by Occam 2, but
they are still noteworthy:

• there are only one dimensional arrays

• there are no types; the programmer has to establish a convention for
variable names and make sure to use them coherently

• no floating point is available

• no recursion is permitted

• no pointers are available

D.       THE  ENVIRONMENT  AT  THE  NPS 

1.  Software  Facilities 

The  Naval  Postgraduate  School  has  several  Software  tools  available  in  its 
computer  labs: 

• Occam Programming System (OPS), available on the VAX/VMS. It permits
editing, compiling, linking and running on the VAX concurrent programs
written in Occam, simulating a network of transputers. It is briefly
described in Appendix B, but the reader may also refer to [Ref 13].

• Transputer Development System (TDS D600), available for the VAX/VMS. It
edits, compiles and downloads the code into the transputer network. It is
briefly described in Appendix C, and additional information may be obtained in
[Ref 14].

• Transputer Development System (TDS D701), available for PC-AT type
micro-computers. It edits, compiles, links, and downloads to the transputer
network the code to be executed (which is generated on the PC). It is single user
and requires installation of the B004⁷ board in the PC. It uses the Occam 2
programming language. This system arrived at the lab when this thesis was
already partially written, so it will not be addressed. For more information
refer to [Ref 15].

⁷ Described in Table 3.

2.  Hardware  Facilities 

a.    Transputer  Boards 

The transputer lab has a Transputer Evaluation Module with seventeen
(17) transputers in the following configuration:

• one board with one transputer (T414-12), called B001 [Ref 16], that is the
interface with the VAX/VMS.

• four boards with four transputers (T414-15) each, called B003 [Ref 17], that can
be used either with the VAX or with the PC.

It also has one board with one transputer (T414-15), called B004 [Ref 18],
that is the interface with the PC and is located in one of the slots of the Zenith Z-248.
This makes a total of 18 transputers to work with.

Table 3 lists the characteristics of these boards.

Figure 1.7 The Four Transputers in the B003 Board - Fixed Links.

These transputers can be interconnected and configured in any way
designed by the programmer using the INMOS links, as long as the hard wired board
connections between transputers (which already exist and are fixed in all B003 boards
in the lab) are respected [Ref 17]. See Figure 1.7.




TABLE 3
CHARACTERISTICS OF BOARDS B001, B003 AND B004

a. B001 Board
   - One IMS T 414-12 mhz transputer
   - 10 mbits/sec INMOS link transmission speed
   - 64 kbytes of static RAM (32 x IMS 1400-45)
   - 128 kbytes EPROM (4 x 27256) containing:
       . bootstrap loader
       . memory test
       . terminal to host transparent mode software
   - 2 RS/232 serial input/output connectors for:
       . VAX connection
       . Terminal connection
   - 64 way DIN connector for external link connections

b. B003 Board
   - 4 IMS T 414-15 mhz transputers
   - 10 or 20 mbits/sec INMOS link transmission speed
   - 256 kbytes dynamic RAM per transputer
   - 96 way DIN connector for external link connections

c. B004 IBM-PC Add-in-Board
   - one T 414-15 mhz transputer
   - 10 mbits/sec INMOS link transmission speed
   - 2 mbytes dynamic RAM with parity
   - 62 pin I/O channel connector


The B001 board is the interface between the VAX and the transputer
network. The interconnection is done through standard RS 232.

The user can develop Occam programs on the VAX, debug and test them
using the OPS and, when ready, download them to be run on the transputers. See
Appendix D.

b.    Host  Computers  and  Terminals 

(1) VAX. To use any of the systems (TDS or OPS) on the VAX, the user
must log in from any VT 100 or VT 220 terminal (the latter has to be in VT 100
mode, with VT 100 id). Appendix A presents a detailed sequence for this.

To be able to download the executable code, the terminal must also be
connected to the B001 board. There are two ways of doing this [Ref 16], and
Figure 1.8 shows how it is done at the NPS lab.

Figure 1.8 System Interconnections VAX-Transputers-Terminal.

The following advantages should be pointed out:

• The VAX provides us with the VMS Operating System and all the facilities a
mini-computer can support, notably a weekly system backup that we do not
have to worry about.

• If a modem is available, much of the work can be done from home.

• Multi-user utilization, as far as using the OPS and editing and compiling in the
TDS are concerned (very handy for class projects).

• Occam 2 will be available soon on the VAX at the NPS, as an upgrade of the
OPS and TDS systems.

The only disadvantage is that when the VAX is down for backup,
upgrades or repairs, there is nothing the user can do about it.

(2) Zenith Z-248. The TDS system for the PC is completely independent
from the VAX. It has a new version of Occam that is more powerful and flexible. It is
installed in a Zenith Z-248 micro-computer (PC-AT compatible), with 2.5 mbytes of
RAM and an 8 MHz clock.

There are two advantages in having a PC. First, the user has the whole micro
to himself, and no problem other than a TDS system failure would delay a project
(assuming that replacing a PC is an easy task). Secondly, the Occam version running
on the PC is temporarily⁸ newer than the one on the VAX, and new horizons are
opened for research.

As mentioned before, this thesis was developed on the TDS and OPS
installed on the VAX, and it contains no further information on the PC based
system.

c.    Printing  Facilities 

There  are  two  ways  to  print  OCCAM  programs  developed  on  the  VAX: 

• Using the VAX/VMS online printer (only files with extensions ".lst" and ".lis"
are printable).

•  Using  the  printer  at  the  lab  and  the  print  screen  facility  provided  by  the  VT  220 
terminals.  Anything  that  is  on  the  screen  can  be  printed  this  way,  and  this 
turned  out  to  be  one  of  the  best  debugging  and  analyzing  tools  for  the  research. 

E.       STRUCTURE  OF  THE  THESIS 

This thesis is presented in 5 Chapters and 7 Appendixes.

Chapter I was the introduction to Occam, the transputer and the NPS
environment. Chapter II describes the terminology, the INMOS links and the methods
used for performance evaluation, and states the expected results and research questions.

Chapter III and Chapter IV address each one of the research questions,
describing the experiments done and presenting the results obtained and the conclusions
reached. Chapter V summarizes the conclusions and suggests future research.


⁸ The Occam 2 version for the VAX/VMS will be available at any moment.

As mentioned already, Appendix A presents a Learning Sequence on how to
work with the transputers and Occam, having the VAX/VMS System as a host.
Appendix B and Appendix C are tutorials about the software tools presently available
for the VAX, the OPS and the TDS systems.

Appendix D presents some hints on how to program in Occam, and calls attention
to some mistakes that one will most likely make when using this new language on a
new system, with a different and powerful fold editor.

Appendix E lists the Link Evaluation Program used, and Appendix F lists the
Test Linearity Program, both with all procedures and library routines that were used.
When reading the listing files, take into account that:

• Occam is a fixed indentation language with two spaces between each nested
level.

• Two dashes (--) mark the beginning of a new fold, with its title alongside.

• Three dashes (---) mean that comments follow on that line only.

II. COMMUNICATION AND PERFORMANCE ISSUES

A.       COMMUNICATION  ISSUES 

The  purpose  of  this  section  is  to  set  the  stage  and  define  a  series  of 
communication  terms  that  will  be  used  in  the  following  discussion  about  the  transputer 
physical  links  performance. 

1.  Definitions: 

• frame - a packet of bits containing 8 data bits plus the frame protocol bits (e.g.
start bit, stop bit, and parity bit).

• bit rate - the number of bits that can be transmitted in a unit of time (e.g.
kbits/sec or mbits/sec).

• baud rate - the number of signal elements transmitted per second. If there are
only two signal elements (0 and 1) then the baud rate is equal to the bit rate. As
this is the case on the transputer, we will mostly refer to bit rate.

• data rate - the number of data elements (bytes) transmitted per unit of
time, normally expressed in bytes per second. It is always smaller than the
bit rate divided by 8, due to the control bits needed in each frame.

• net bit rate - (or transfer rate) is defined by the author as 8 times the data
rate. It is used to make comparisons with the advertised values.
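
The relationship between these definitions can be made concrete with a little arithmetic. The Python sketch below is only an illustration: the function names are ours, and the default of 3 protocol bits per frame is an assumption chosen to match the example in the frame definition (start, stop and parity bit). It gives the ceiling imposed by framing alone and says nothing about measured performance.

```python
def data_rate_bytes_per_sec(bit_rate, data_bits=8, protocol_bits=3):
    """Upper bound on bytes/sec when every frame carries protocol bits as well."""
    return bit_rate / (data_bits + protocol_bits)


def net_bit_rate(data_rate):
    """Net bit rate as defined in the text: 8 times the data rate."""
    return 8 * data_rate


bit_rate = 10_000_000                      # 10 mbits/sec
rate = data_rate_bytes_per_sec(bit_rate)   # bytes per second
print(round(rate), round(net_bit_rate(rate)))

# The data rate is always smaller than the bit rate divided by 8.
assert rate < bit_rate / 8
```

With these assumptions the net bit rate is 8/11 of the bit rate; any additional waiting on the line lowers it further.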

2.  Data  Transmission  Basics 

a.  Modes  of  Operation 

• parallel transfer mode: when multiple wires are used between the two
pieces of equipment, each one of them carrying one bit of the frame.

• bit serial transmission: when only one wire is used to send the frame, one bit
after the other.

b. Communication Modes

• simplex: when data is transmitted in one direction only.

• half duplex: when data is transmitted in both directions, but alternately
(switching between transmit and receive mode is necessary).

• duplex (or full duplex): when data is exchanged in both directions
simultaneously.

c.  Transmission  Modes 

• Asynchronous Transmission - when the receiver and transmitter clocks are
independent. Each frame received reinitializes the clock, as the start bit is
received. It is used when the rate at which characters are generated is
undetermined and hence the transmission line can be idle for long periods
between transmitted characters.

• Synchronous Transmission - when the receiver and transmitter clocks are dependent
and information is packed in long streams of characters instead of byte by byte.
Special synchronizing bytes are used before each block.

Most of the information contained in this section was taken from [Ref 19],
and it is included here only to make the reading smoother.

B.       THE  TRANSPUTER  LINKS 

"The  transputer  architecture  simplifies  system  design  by  using  point  to  point 
communication  links.  Every  member  of  the  transputer  family  has  one  or  more 
standard  links,  each  of  which  can  be  connected  to  a  link  of  some  other 
component.  This  allows  transputer  networks  of  arbitrary  size  and  topology  to  be 
constructed." 

This quotation, extracted from [Ref 7: p. 6], gives us a macro sense of what the
link is and how it can be beneficial for the programmer. Following the terms described
in the previous section, we can say that the transputer links are serial, full duplex,
asynchronous communication devices that have a bit rate of 10 mbits/sec or 20
mbits/sec (when available). They provide synchronization between communicating
processes on a transputer network.

To  provide  the  reader  with  a  better  understanding,  the  following  includes  some 
details  about  the  links,  extracted  from  [Ref  7:  p. 7]: 

•  Each physical link provides two Occam channels, one in each direction (input
and output). The T-414 has four (4) links, so we have 8 physical channels for
programming purposes in each transputer.

•  Communication  via  any  link  may  occur  concurrently  with  communication  on  all 
other  links  and  with  program  execution. 

•  Synchronization of processes at each end of a link is automatic and requires no
explicit programming. This is one of the important features one can use with
the transputer. The links are the concurrency tools and are very easy to
program by using the Occam channels.

•  The information is transmitted on the link in the format depicted by Figure 2.1,
where the two leading "1" bits are start bits and the trailing "0" is the stop bit.

•  After transmitting a data frame (one byte), the sending transputer waits for an
acknowledge (ACK) from the receiving transputer, signifying that the byte was
received and that the link is ready to receive another byte. If the ACK is not
received, the communications on that link will stop.

It is still worth mentioning one of the questions we had about how they work:

•  "How could a process waiting for communication waste no cpu cycles?"


| 1 | 1 | b | b | b | b | b | b | b | b | 0 |
start bits |     information bits (b)      | stop bit

a.  The DATA frame (11 bits)

b.  The Acknowledge frame (2 bits)

Figure 2.1     The Data and Acknowledge Frames.

The information we got verbally from Mr. Neil Mitchell, from the INMOS
office in Santa Clara, was that each link has a 1 byte buffer inside it. When a process
has to transmit, the first byte of the message is in fact transmitted, received by
the link on the receiving transputer, and stored in this buffer. Two situations may
then occur:

1  If  that  receiving  link  is  already  waiting  for  an  input,  the  acknowledgement  is 
sent  right  away  to  the  transmitter  by  the  other  channel,  and  this  is  all  the 
transmitter  needs  to  follow  on  with  the  message. 

2  If  that  receiving  link  is  not  waiting  for  any  input  yet,  the  acknowledgement  is 
not  sent.  What  happens  in  the  sending  transputer  is  that,  as  the  ACK  does  not 
arrive,  the  process  is  placed  on  the  wait  queue,  and  a  pointer  to  that  process  is 
placed  in  the  respective  channel  memory  location  (each  channel  has  its  own) 
until  the  ACK  arrives.  When  this  happens,  the  process  is  awakened  and  the 
message  is  then  transmitted. 




C.       EXPECTED  RESULTS 

Assuming  we  have  a  bit  rate  of  10  mbits/sec  and  the  frames  have  no  delay 
between  them,  two  situations  should  be  considered: 

•  One  Channel  Transmitting  -  when  only  one  of  the  channels  is  being  used  for 
transmission  (or  reception)  of  messages  at  a  time,  and  so  the  other  channel  is 
free  to  bring  the  ACK  frames. 

•  Both Channels Transmitting/Receiving - when we have message passing in both
channels at the same time, so that the ACK for a received frame is piggy-backed
(appended to the end of the frame) [Ref 19: p. 129] onto the next transmitted
frame.

1.  One Channel Transmitting

In  this  case,  there  is  no  ACK  sharing  time  with  the  frame  on  the  channel  and 
we  will  get  the  maximum  rate  possible  as  follows: 

•  Net bit rate = (8/11) * 10 (mbits/sec) = 7.27 mbits/sec or 7,273 kbits/sec,
where 8 is the number of information bits and 11 the total number of bits in a
frame.

•  Data rate = 7.27 / 8 = 0.91 mbytes/sec or 909 kbytes/sec.

2.  Both  Channels  Transmitting/Receiving 

In this case we will have:

•  Net bit rate = (8/13) * 10 (mbits/sec) = 6.15 mbits/sec or 6,154 kbits/sec, where
13 stands for the 11 frame bits plus 2 ACK bits that are now also sharing the
link.

•  Data rate = 6.15 / 8 = 0.77 mbytes/sec or 769 kbytes/sec.

The results are summarized for the 10 and 20 mbits/sec rates in Table 4.


TABLE 4

EXPECTED MAXIMUM TRANSFER RATES ON THE TRANSPUTER LINKS

link bit rate        10          20       mbits/sec

One channel        7,273      14,545      kbits/sec

Both channels      6,154      12,308      kbits/sec
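
The arithmetic behind these expected rates can be checked with a few lines of code. What follows is only a minimal sketch, written in Python rather than in the Occam used in this thesis; the function name `expected_rate_kbits` and the constants are illustrative assumptions, and the rounding to the nearest kbit/sec is done in integer arithmetic.

```python
# Sketch only: expected maximum net bit rate on a link, assuming frames follow
# each other with no delay. All names here are illustrative assumptions.

INFO_BITS = 8     # information bits carried by one data frame
FRAME_BITS = 11   # 2 start bits + 8 information bits + 1 stop bit
ACK_BITS = 2      # bits of an acknowledge frame sharing the same channel


def expected_rate_kbits(link_mbits, both_channels):
    """Return the expected net bit rate in kbits/sec, rounded to nearest."""
    bits_per_byte = FRAME_BITS + (ACK_BITS if both_channels else 0)
    link_kbits = link_mbits * 1000
    # Integer rounding to nearest: (a + b // 2) // b
    return (INFO_BITS * link_kbits + bits_per_byte // 2) // bits_per_byte


for mbits in (10, 20):
    one = expected_rate_kbits(mbits, both_channels=False)
    both = expected_rate_kbits(mbits, both_channels=True)
    print(mbits, "mbits/sec:", one, "kbits/sec (one),", both, "kbits/sec (both)")
```

For the 10 mbits/sec case this reproduces the 7,273 and 6,154 kbits/sec derived in the text.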


The values are given in kbits/sec because floating point was not available;
to get some precision, we needed to use this unit in all performance
measurements during the evaluation.

It is worth mentioning that these values were expected for either one single
channel or the eight channels operating in parallel, because the memory is multi-ported
and permits access to each one of the links and the processor in an interleaved mode
[Ref 7: section 2, p. 1]. It was also expected that these rates should not be affected by
another process using the Central Processing Unit (CPU) for calculations and memory
accesses at the same time, for the same reasons mentioned above.

D.       RESEARCH  QUESTIONS 

From the above, some research questions could be devised as follows:

1  Does a link transmit at the 10 mbits/sec and 20 mbits/sec transfer rates?

2  Is the ACK really transmitted as soon as the receiver channel receives the first
bit of the data packet?

3  Is  the  communication  between  the  transputers  really  occurring  in  parallel? 

4  What is the effect of message lengths on the link transfer rates?

5  What is the mutual effect on the link transfer rates of more links operating in
parallel in the same transputer?

6  Can the CPU work in parallel with all the links?

7  What is the effect of a communication independent process, running on the
CPU, over the transfer rates obtained in a link by another process in this
transputer?

8  What  is  the  effect  of  the  communications,  over  the  process  that  is  being 
executed  in  the  CPU? 

9  Does  the  Transputer  achieve  linear  performance  improvement? 

10       What  happens  when  a  process  is  time  sliced  in  the  middle  of  a  communication 
by  physical  link?  Does  the  link  stay  blocked? 

Questions 1 through 6 will be discussed in Chapter III, and questions 6 to 9 in
Chapter IV. Question 10 is still pending and is left for further research.

E.       PERFORMANCE  MEASUREMENT  ISSUES 

As mentioned in the paper by Cellary [Ref 20], there are five methods for
computer network measurements, depending on the approach used for data gathering.
They are:

•  Standard  User  Method, 

•  Reference  User  Method, 

•  Software Monitoring Method (Programs),

•  Hardware Monitoring Method (Probed Equipment), and

•  Hybrid Monitoring Method (a mix of the two above).

In  this  thesis  both  Software  and  Hardware  monitoring  methods  were  used  for  the 
following  reasons: 

•  The  hardware  monitors  are  more  reliable  than  the  software  monitors. 

•  For statistical purposes and for large amounts of data, it is sometimes impossible
to obtain, using hardware measurements, the same amount of information that
can be collected by software programs in the same period of time.

This way, we used hardware monitors to confirm preliminary results obtained by
software and, after validating them, gathered a massive collection of data to permit
and back up the conclusions reached.

1.  Hardware Methods

Two  approaches  were  used: 

•  by using an Oscilloscope to monitor 1 or 2 channels of a link at the same time.

•  by using a Logic Analyzer to monitor 4 and 8 channels (in 4 different links) of
the same transputer.

a.  Using  the  Oscilloscope 

The idea of using the Oscilloscope was to identify on the screen a known
pattern of bits in continuous transmission, and also to obtain an approximation of the
bit rate. By observing subsequent frames, we also tried to estimate the data rate and
the interval between frames. Another observation that could be made, as seen in the
following chapter and also documented using Polaroid photographs, is the relative
position of the Acknowledge (ACK) frame, in reference to the transmitted frame, in the
second oscilloscope channel.

The  equipment  used  was  the  Tektronix  364  Storage  Oscilloscope  and  the 
camera  was  the  Hewlett  Packard  HP-24A. 

b.  Using  the  Logic  State  Analyzer 

The idea of using the Logic Analyzer was to monitor several channels of the
same transputer and really see if there were bits being transmitted at the same time in
some or all of the channels. Our Logic Analyzer has the capability to monitor 32
channels and store 250 subsequent bits in each channel after being triggered.

As all channels are asynchronous, an external clock was necessary and so a
Pulse Generator was used to provide this clock. To help in getting a more precise clock,
a Digital Counter was also used to sample it. The equipment used was:

•  Logic State Analyzer Mod. 532 with Analyzer Probe Model 51 A.

•  20 mhz Function/Pulse Generator Wavetek Model 145.

•  Measuring System Hewlett Packard model HP-5300A.

One  problem  arose  from  this: 

•  The maximum external clock frequency acceptable by our logic analyzer was 12
mhz and, as recommended by the Nyquist relation, the sampling frequency
should be at least double that of the sampled signal (normally 16 times is used)
[Ref 19: p. 15].

In our case, the sampled signal was supposedly at 10 mhz and so a
minimum clock of 20 mhz should be used. As the Logic Analyzer did not permit that,
we used a 10 mhz pulse as the clock instead and, by trial and error, varying the clock
frequency and pulse width, after numerous attempts we obtained some representative
results that are presented in the following chapter. It is worth mentioning that we did
not even try to monitor the links running at 20 mhz, for the same reason.
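
The sampling constraint just described can be written down as a small check. This is a minimal sketch in Python, not part of the thesis software; the function names are illustrative assumptions, and the factor of 2 is the Nyquist minimum quoted above (the factor of 16 mentioned as normal practice would be stricter still).

```python
# Sketch only: can the logic analyzer sample a link at the Nyquist minimum?
# The names below are illustrative assumptions.

ANALYZER_MAX_CLOCK_MHZ = 12   # external clock limit reported in the text
NYQUIST_FACTOR = 2            # sampling frequency >= 2 x signal frequency


def min_sampling_clock_mhz(signal_mhz, factor=NYQUIST_FACTOR):
    """Smallest sampling clock (in MHz) allowed by the chosen factor."""
    return factor * signal_mhz


def analyzer_can_sample(signal_mhz, max_clock_mhz=ANALYZER_MAX_CLOCK_MHZ):
    """True if the analyzer clock limit satisfies the Nyquist minimum."""
    return min_sampling_clock_mhz(signal_mhz) <= max_clock_mhz


for link_mhz in (10, 20):
    print(link_mhz, "mhz signal needs", min_sampling_clock_mhz(link_mhz),
          "mhz clock; analyzer sufficient:", analyzer_can_sample(link_mhz))
```

Neither link speed passes the check, which is why the logic analyzer results had to be obtained by trial and error.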

c.    Test  Points 

To monitor the links' activity, a homemade monitoring bridge able to connect
up to eight channels was used; with it, we had the ability to monitor
the four links of a transputer.

2.  Software Methods

For the software measurements, several programs were first written to compare
the rates obtained in hardware with the ones obtained in software. For the final
report on the links' performance, a complete Link Evaluation Program was designed to
handle all possible cases of communication constructs, several kinds of channel
parallelism, and two different cases of CPU load concurrent with the communications.
The output of this program was a table of values that was used to generate graphs
using the EASYPLOT system on the IBM 3033. Appendix E presents a listing of the
evaluation program with the Occam library used. The terminal driver is the one
provided by INMOS, with the Keyboard and Screen references made using the first
letter in uppercase, and therefore is not included.

The library.occ is a collection of previously existing procedures, some generated
by the manufacturers and some made originally for the OPS System by previous
workers, updated to be used on the TDS, plus additional procedures for i/o and utilities
written by Cordeiro and myself. They can be browsed in Appendix E, inside the
program listing.

To  observe  the  effect  of  multiple  transputer  execution  of  the  same  program,  a 
series  of  versions  of  Program  TEST  LINEARITY  were  made  and  the  17  transputer 
version  is  listed  in  Appendix  F. 

All  programs  above  used  basically  the  same  three  tools: 

•  The TIME channel, provided by the compiler and Occam to read the internal
transputer clock in ticks. Table 5 summarizes the tick values.

•  The tick.to.time procedure, used to convert time from ticks into hours,
minutes, seconds and milliseconds. It receives as input parameters the "startime"
(in ticks), the "endtime" (in ticks) and the transputer type, and outputs to the
screen the elapsed time in hours, minutes, seconds and milliseconds, for the
specified transputer. This routine is listed in Appendix F.

•  The transfer.rate procedure, similar to the previous one, but which computes
the transfer rate measured in the channel observed. It receives as parameters
"startime", "endtime", "transputer type nr.", and the "size of the message"
transmitted, and outputs the transfer rate through the variable "rate". This
routine is listed in Appendix E.


TABLE 5
THE DIFFERENT TICK VALUES

T-414 12mhz          -->  1 tick = 1.5 micro-seconds

T-414 15mhz (high)   -->  1 tick = 1   micro-second

T-414 15mhz (low)    -->  1 tick = 64  micro-seconds

VAX/VMS              -->  1 tick = 100 nano-seconds
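
To illustrate how a transfer rate in kbits/sec can be obtained from two clock readings without floating point, the following is a minimal sketch in Python. It is not the transfer.rate routine of Appendix E: the function name `transfer_rate_kbits`, the dictionary keys, and the absence of any handling for clock wrap-around are all assumptions made for illustration. The tick durations are those of Table 5, expressed in nanoseconds so that only integers are needed.

```python
# Sketch only: integer transfer-rate computation from tick readings.
# Names are illustrative assumptions; clock wrap-around is not handled.

TICK_NS = {
    "T414-12mhz": 1500,        # 1 tick = 1.5 micro-seconds
    "T414-15mhz-high": 1000,   # 1 tick = 1 micro-second
    "T414-15mhz-low": 64000,   # 1 tick = 64 micro-seconds
    "VAX/VMS": 100,            # 1 tick = 100 nano-seconds
}


def transfer_rate_kbits(start_tick, end_tick, machine, n_bytes):
    """Return the transfer rate in kbits/sec (truncated), integers only."""
    elapsed_ns = (end_tick - start_tick) * TICK_NS[machine]
    if elapsed_ns <= 0:
        raise ValueError("end tick must be later than start tick")
    bits = n_bytes * 8
    # bits per nanosecond times 1,000,000 gives kbits per second
    return bits * 1_000_000 // elapsed_ns


# Hypothetical reading: 1,000 bytes sent in 2,000 ticks of 1 micro-second
print(transfer_rate_kbits(0, 2000, "T414-15mhz-high", 1000), "kbits/sec")
```

Working in kbits/sec keeps three more significant digits than mbits/sec would when only integer division is available.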
III.  THE EVALUATION STARTS

A.       INTRODUCTION 

In  this  chapter  we  start  to  address  the  research  questions  related  to  the 
evaluation,  as  listed  in  Section  D  of  Chapter  II. 

Section B describes how we verified that the bit rate is, indeed, 10 mbits/sec or 20
mbits/sec. It also shows the maximum values achieved for the net bit rate (transfer
rate), for the various construct types.

Section  C  shows  the  configuration  used  and  demonstrates  that  the  transfers  in 
different  links  occur  in  parallel,  eventually  in  all  8  channels  of  the  4  links. 

Section  D  describes  the  message  size,  and  the  channel  parallelism  effects  on  the 
transfer  rates  for  the  various  constructs. 

1.  The  Available  Constructs 

Occam  permits  us  to  use  several  different  primitives  and  procedures  for 
communications  between  processes.  The  first  to  be  mentioned  are  the  input  and  output, 
already  explained  in  Chapter  I.  We  used  them  in  two  modes: 

•  transmitting  bytes  (characters),  or 

•  transmitting  words  (integers). 


BYTE.SLICE.OUTPUT (chanid, buffername, initbyte, blocksize)

where:

- chanid     - the channel name where the communication will occur

- buffername - the name of the array of variables

- initbyte   - the array index of the first byte to be transferred

- blocksize  - the number of bytes to be transferred

Figure 3.1     The BYTE SLICE OUTPUT Procedure Call.
The third mode is the BYTE SLICE INPUT and BYTE SLICE OUTPUT
procedures. These procedures are microcoded subroutines that provide a block transfer
of bytes. Figure 3.1 shows the procedure call and an explanation of the parameters
[Ref. 14: section 4]. These procedures cannot be used when writing programs for the
OPS. The advantage they bring us is better performance, but when using the OPS we
are not concerned about it.

The last mode is the WORD SLICE INPUT and WORD SLICE OUTPUT
procedures, also microcoded, that provide block transfer of words. As mentioned
above, these procedures proved to be much faster than the input/output primitives, but
with similar performance to the BYTE SLICE procedures.

2.  Considerations About Memory Management

As we have a machine with internal and external memory with different
performances and address capabilities, memory management was a major concern,
since performance could be, and in fact is, affected by it. The documentation is not
clear enough to let us establish how this is handled by the processor in the B001 and
B003 boards. We tried to check the addresses mentioned in [Ref 7: section 2, pp. 5,7],
but we were not able to verify them.

What can be said, though, is that it looks like the memory (internal plus
external) on the B001 transputer board is divided into four memory banks, beginning
at addresses 0, 16k, 32k, and 48k, and the data and programs are
mapped evenly over these four banks. We reached this conclusion after browsing
several listings of the memory contents obtained from the transputer in the B001
board, through a "dump" routine designed and implemented by M. Cordeiro, also part
of the LIBRARY.OCC, included in Appendix F.

In our evaluation program outputs, we tried to observe any noticeable effects
that could be explained by faster or slower memory access, but the evidence was not
strong enough, as will be mentioned further on. As a curiosity, we measured the time
to initialize four arrays of 15,000 bytes each in the B001 memory and obtained 133
msec! We assumed that programs smaller than 2k bytes will be loaded entirely
into internal memory, but we could not prove it; this is left, and strongly
recommended, for further research.

B.       A  CLOSE  LOOK  ON  THE  BIT  RATE 

The evaluation started by trying to answer research question 1, which is
transcribed here:

•  "Do the links transmit (and receive) data at 10 and 20 mbits/sec transfer rates?"

1.  First Software Results

To find that out, simple programs were made to transmit and receive long
messages (arrays) through the physical links. The transfer rate was obtained by dividing
the number of bits transmitted by the time spent on the transmission. A flag
(single byte) was sent from receiver to transmitter to ensure that the transmitter
would only transmit when the receiver was ready. This way, we would be timing the
best possible case with the best possible accuracy. The basic program code used for
the transmitter and receiver is in Figure 3.2. The BYTE SLICE was the construct used,
because from the very first tests it proved to be the fastest, even for one byte being
transmitted.

The configuration used was as simple as it could be: two transputers
connected by a link, one hosting the transmitter procedure (TR.1) and the other the
receiver procedure (TR.2). Figure 3.3 depicts that.


SEQ
  chan1 ? flag         -- flag is received
  TIME ? startime      -- time is stored in var startime
  -- transfer begins
  BYTE.SLICE.OUTPUT (chan2, buffername, 1, block.size)
  -- transfer ends
  TIME ? endtime       -- time is stored in var endtime
  -- call to procedure transfer rate outputs the rate.
  transfer.rate (startime, endtime, transputer.type,
                 blocksize, rate)

a)  Transmitter

SEQ
  chan1 ! char         -- flag is sent to transmitter
  BYTE.SLICE.INPUT (chan2, buffername, 1, block.size)

b)  Receiver


Figure 3.2     Basic Code for Transmitter and Receiver.

The block size used was 15,000 bytes, in order to avoid possible dragging
effects of small messages. The results obtained were around 3,800 kbits/sec, with an
average execution time of 31.5 msec. As we can see from Table 4, in Chapter II, this
was almost half of the expected value of 7,273 kbits/sec. Why? Monitoring the
channels with the oscilloscope answered this question.
            chan1
        <-----------
TR 1                    TR 2
        ----------->
            chan2

Figure 3.3     Configuration for Initial Tests.

a.  Links at 20 mbits/sec

With the links switched to 20 mbits/sec, we could only have
communications between transputers located on B003 boards, so, although the code
was practically the same, the configuration had to be slightly different. Figure 3.4
shows us how it was.


[Diagram: Root (B001), TR 1 (B003) and TR 2 (B003), linked by channels chan1,
chan2, chan3 and chan4.]

Figure 3.4    Configuration for Measuring Links at 20 mbits/sec Bit Rate.

The results obtained for block sizes of 15,000 bytes, also using the BYTE
SLICE construct, were of the order of 6,000 to 6,100 kbits/sec, again very small if we
compare them with the expected 14,545 kbits/sec.

2.  Using the Oscilloscope

Another simple program, which made a continuous transfer on the link, made it
possible to observe the frame transmitted and estimate the rate on the oscilloscope
screen. The message transmitted, using BYTE SLICE, was a sequence of TRUES. The
TRUE, in Occam, is a sequence of 8 binary 1's and so the frame was as Figure 3.5
shows.

| 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
start bits |     information bits (TRUE)      | stop bit


Figure 3.5     Frame Transmitted for Oscilloscope Observations.

The basic code used is depicted in Figure 3.6. There is no time sampling or
flags, to avoid any side effect on the oscilloscope screen. Figure 3.7 shows the picture of
a frame like the one in Figure 3.5 followed by an acknowledge (both appear on the
same trace due to the vertical mode ADD used on the oscilloscope). All the oscilloscope
settings are also mentioned below the picture.


WHILE TRUE
  BYTE.SLICE.OUTPUT (chan2, buffer1, 1, block.size)

a)  code on the transmitter

WHILE TRUE
  BYTE.SLICE.INPUT (chan2, buffer2, 1, block.size)

b)  code on the receiver


Figure 3.6    Example Code for Oscilloscope Observations.

Note  from  Figure  3.7  that  the  10  "ones"  of  the  frame  occupy  5  divisions.  This 
sums  up  to  1  microsecond.  So  we  have  one  bit  per  0.1  microsecond  and  this  implies  a 
bit  rate  of  10  mbits/sec  (gross). 
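
The reading taken from the oscilloscope screen can be expressed as a one-line calculation. The sketch below is in Python and is only illustrative; the function name `bit_rate_mbits` is an assumption, and the inputs are the values quoted above (10 bits spanning 5 divisions at 0.2 microsec per division).

```python
# Sketch only: gross bit rate estimated from an oscilloscope reading.
# The function name is an illustrative assumption.


def bit_rate_mbits(n_bits, divisions, usec_per_division):
    """Bits per microsecond, which is numerically the rate in mbits/sec."""
    duration_usec = divisions * usec_per_division
    return n_bits / duration_usec


# 10 "ones" occupying 5 divisions at 0.2 microsec per division
print(bit_rate_mbits(10, 5, 0.2), "mbits/sec (gross)")
```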


Conclusion 1

The bit rate is in fact 10 mbits/sec, if we consider only one frame.


Oscilloscope Settings:

channel  1     -->  shows  the  transmitted  frame 
channel  2     -->  shows  the  acknowledge 
time  scale    -->  0.2  microsec  /  division 
voltage  scale  ->  2  volts  /  division 
vertical  mode  ->  ADD 
trigger  source->  channel  1 


Figure 3.7     Picture of One Frame at 10 mbits/sec Rate.

The Acknowledge appears enlarged due to the lack of synchronization
between both channels, the trigger source being oscilloscope channel 1. We cannot
take precise measurements, but we can estimate the best and worst cases:

•  best  case  -  The  ACK  pulse  is  beginning  at  the  trailing  edge  (leftmost)  of  the 
ACK  frame.  This  will  give  us  a  distance  of  approximately  200  nsec  between  the 
last  bit  of  the  data  frame  and  the  acknowledge  frame  (remember  that  there  is  a 
"zero"  bit  after  the  last  "one"). 

•  worst  case  -  The  ACK  ends  at  the  leading  edge  (rightmost)  of  the  ACK  pulse. 
This  will  give  us  a  distance  of  approximately  300  nsec  instead. 


Conclusion 2

The ACK frame leaves the receiver 200 to 300 nsec after the transmitted frame
arrived!


Oscilloscope Settings:

channel 1      --> shows the transmitted frame
channel 2      --> shows the acknowledge
time scale     --> 0.5 microsec / division
voltage scale  --> 2 volts / division
vertical mode  --> ADD
trigger source --> channel 1


Figure 3.8     Three Data Frames at 10 mbits/sec Rate.


Increasing the time scale of the oscilloscope to 0.5 microseconds per division,
we could observe more frames and acknowledges, as shown in Figure 3.8. From this
picture, using the same best and worst case approach, we could estimate that the
distance between the ACK and the following frame (center) is between 500 and 600
nsec. We could also notice that the distance between consecutive data frames is
between 900 and 1000 nanoseconds.

So, estimating the transfer rate from the picture, and assuming all frames will
keep at least this space between them, we got:

•  best case - for each 1100 nsec of information we have 900 nsec of line inactive.
If we multiply this ratio by the expected transfer rate of 7,273 kbits/sec, we get

(1100 / 2000) x 7273 = 4000 kbits/sec

•  worst case - 1000 nsec of line inactive would bring us

(1100 / 2100) x 7273 = 3809 kbits/sec

As we can see, the hardware results were confirming the previously obtained
software results.
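
The best and worst case estimates above follow from scaling the expected rate by the fraction of time the line is active. The following minimal sketch, in Python, reproduces that arithmetic; the function name `estimated_rate_kbits` is an illustrative assumption, the 1100 nsec frame time corresponds to 11 bits at 100 nsec per bit, and the results are truncated to whole kbits/sec.

```python
# Sketch only: transfer rate estimated from the idle gap between frames.
# Names are illustrative assumptions; results are truncated, not rounded.

EXPECTED_KBITS = 7273   # one channel transmitting at 10 mbits/sec
FRAME_NS = 1100         # 11 bits at 100 nsec per bit


def estimated_rate_kbits(idle_ns, expected_kbits=EXPECTED_KBITS,
                         frame_ns=FRAME_NS):
    """Scale the expected rate by the active fraction of each frame period."""
    return expected_kbits * frame_ns // (frame_ns + idle_ns)


print("best case :", estimated_rate_kbits(900), "kbits/sec")
print("worst case:", estimated_rate_kbits(1000), "kbits/sec")
```

Both values bracket the roughly 3,800 kbits/sec measured by software for the BYTE SLICE transfer.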

Another fact to add is that, during our observations, the frames were not
always equally spaced as shown in Figure 3.8. In fact, this figure shows the most
equally spaced results we ever obtained. In Figure 3.9, in which the time scale was once
more increased, to 1 microsec per division, we can note that the fifth frame in the
channel at the bottom is more spaced than the four previous ones. In this picture the
ALT vertical mode was used to permit us to see the ACK on the upper trace. Note the
regularity with which the acknowledge appears 200 to 300 nsec after the received frame.


Oscilloscope Settings:

channel 1      --> shows the transmitted frame
channel 2      --> shows the acknowledge (upper)
time scale     --> 1.0 microsec / division
voltage scale  --> 2 volts / division
vertical mode  --> ALT
trigger source --> channel 1 (bottom)


Figure 3.9     Five Frames Observed at 10 mbits/sec Rate.

a.  Switching the Rate to 20 mbits/sec

Similar observations were made for the links operating at the 20 mbits/sec rate.
Figure 3.10, which was taken with a time scale of 0.1 microsec per division, shows the
same 10 "ones" of Figure 3.7 in approximately 0.5 microsecond, that is, half of the time
that was obtained there. The ACK is now in oscilloscope channel 1 and is the trigger
source (this is the reason it is now well defined).


Oscilloscope Settings:

channel 1      --> shows the acknowledge (previous)
channel 2      --> shows the transmitted frame (next)
time scale     --> 0.1 microsec / division
voltage scale  --> 2 volts / division
vertical mode  --> ALT
trigger source --> channel 1 (upper)


Figure 3.10    One Frame and the ACK at 20 mbits/sec Rate.

Note  the  time  delay  between  the  ACK  (upper  trace)  and  the  following 
frame  (lower  trace)  that  was  measured  as  about  400  nsec. 

Figure 3.11, taken with a time scale of 0.5 microseconds per division, shows us a
series of "TRUE" frames at the 20 mbits/sec rate and the ACK frames in the same trace.
We could estimate that the link is actively transmitting approximately 40% of the
total time. If we take 40% of the predicted rate of 14,545 kbits/sec we get
5,818 kbits/sec. Comparing this with the software obtained value of 6,100 kbits/sec, we
can conclude that the value is reasonable enough as an explanation of the software
results.


Oscilloscope  Settings: 

channel  1  -->  snows  the  acknowledge 
channel  2  -->  shows  the  transmitted  frame 
time  scale  -->  0.5  microsec  /  division 
voltage  scale  ->  2  volts  /  division 
vertical  mode  ->  ADD 
trigger  source->  channel  2 


Figure  3.11     Four  Frames  and  ACK  at  20  mbits/sec  Rate. 

3.  Comparison  Between  the  Constructs 

In this subsection we would like to include the maximum values of the transfer
rate ever obtained for the various constructs. They are summarized in Table 6 and
were obtained using different programs, with different message sizes, which are
therefore not mentioned. It is interesting to note that the input operation appears to
have a slight tendency to be quicker than the output, which is not really the case. This
occurs because of the flag positioning, which slightly affects the measured rate; the
rates should be considered the same. What can be mentioned, however, is that
[Ref. 7: section 2, pp. 26,27] shows an expected performance summary, and there the
input primitive is rated as using 26.5 processor cycles while the output would take
26 cycles, which is not much of a difference. This same reference still mentions that
the values are not definitive and may suffer changes as more information is collected.


TABLE 6
MAXIMUM TRANSFER RATES OBTAINED (KBITS/SEC)

            input/output Primitives     BYTE SLICE     WORD SLICE
              bytes        words        procedure      procedure

output         595         2412           3880           3669

input          631         2855           3804           3786


Browsing the figures in Table 6, one question comes up at once:
"Why is the transmission using the input and output primitives so much
slower in comparison to the built-in procedures?"

For the byte transmission case using the primitives, Figure 3.12 shows how an
array of "TRUEs" is transmitted through the link at the selected bit rate of 10
mbits/sec. The information seems to be stored one byte per word, so that each "TRUE"
byte is followed by three empty frames. Note that the empty frames carry only the
start bits (two "ones"). The time between frames containing information, measured at
the lab, was 13 microseconds.
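
The 13 microsecond spacing can be turned into an expected transfer rate with one line of arithmetic. The sketch below is written in Python rather than Occam, purely for illustration. It assumes that exactly one information byte (8 data bits) is delivered every 13 microseconds and ignores any start-up effects; the function name is hypothetical and is not part of the thesis programs.

```python
def rate_kbits_per_sec(data_bits, interval_us):
    """Data rate in kbits/sec when `data_bits` arrive every `interval_us` microseconds."""
    # Bits per microsecond equals Mbits/sec; multiply by 1000 for kbits/sec.
    return data_bits / interval_us * 1000.0

# One information byte every 13 microseconds, as measured on the oscilloscope.
byte_rate = rate_kbits_per_sec(8, 13.0)
print(round(byte_rate))  # about 615 kbits/sec
```

Under that assumption the result falls between the 595 and 631 kbits/sec that Table 6 lists for bytes sent with the primitives.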

For the word (integer) transmission case, Figure 3.13 shows a pattern similar
to that of Figure 3.12, with the difference that all frames are effectively
carrying information bits. The information used to ease the observation was maxint,
which for our 32 bit machine is 2,147,483,647 decimal or "7FFFFFFF" hexadecimal.^ The
elapsed time measured at the lab between the acknowledge of the last byte of the first
word and the first byte of the second word was around 5 microseconds. Repeating the
calculations done for the BYTE SLICE case, one will conclude that the maximum
values obtained are in accordance with the observations on the oscilloscope.


^The transputer T414 uses signed integers in the range -2,147,483,648 to
2,147,483,647 decimal, or 80000000 to 7FFFFFFF hexadecimal, respectively [Ref. 7:
section 2, p. 2].

Oscilloscope Settings:

channel 1      --> shows transmitted frame (upper)
channel 2      --> shows the acknowledge (lower)
time scale     --> 2.0 microsec / division
voltage scale  --> 2 volts / division
vertical mode  --> ALT
trigger source --> channel 1


Figure  3.12    TRUES  Transmitted  Using  the  Input/Output  Primitives. 

As a conclusion of this section, we could show that the software
measurements and the procedures used to calculate the transfer rate were producing
reasonable values that agreed with those observed on the oscilloscope. The reason we
had not obtained the expected transfer rates was that the link is not continuously
active, as the literature had led us to believe: there is a considerable delay between
the receipt of a frame and the departure of the corresponding ACK. Also, after the
ACK is received by the transmitter, there is another delay before the next frame is
transmitted.^^


^^In fact, during the Occam User Group meeting already mentioned, in Santa
Clara, CA, Mr. Martin Booth from the INMOS office at Santa Clara said that the data
rate we should really expect on the links was 450 kbytes/sec, which agrees with our
results (450 x 8 = 3600 kbits/sec).

Oscilloscope Settings:

channel 1      --> shows transmitted frames (upper)
channel 2      --> shows the acknowledge (lower)
time scale     --> 2.0 microsec / division
voltage scale  --> 2 volts / division
vertical mode  --> ALT
trigger source --> channel 1


Figure  3.13     Maxint  Transmitted  Using  the  Input/Output  Primitives. 

It is expected, however, that the new transputer version, the T800, will solve
this problem by permitting the acknowledge to leave the receiver as soon as the first
bit of the frame arrives, so that the delay would not exist, or would at least be
smaller [Ref. 21].

C.       OBSERVING  PARALLEL  ACTIVITY  ON  THE  LINKS 

1.  Using Software

To  observe  the  links  working  at  the  same  time,  we  needed  to  build  a  different 
configuration.  As  we  have  4  links  per  transputer,  we  needed  at  least  5  transputers  to 
make  all  links  work  in  parallel  at  the  highest  possible  rate.  The  configuration  used  is 
depicted  in  Figure  3.14  . 


[Figure 3.14 diagram: transputer ROOT connected by its four links to transputers
0, 1, 2 and 3, using channels CHAN[0] through CHAN[7], one input and one output
channel per link.]


Figure  3.14     Configuration  to  Observe  the  Four  Links  Operating  in  Parallel. 

In processor root we placed a procedure that was able to transmit and/or
receive in parallel to/from the four transputers 0, 1, 2 and 3. Figure 3.15 shows the
Occam code used to do that.

Using Figure 3.15 as a template, one can implement similar code to transmit
on 2 or 3 channels in parallel just by removing the unwanted BYTE SLICE procedure
calls. Note that the channels mentioned in Figure 3.15 are in accordance with the ones
in Figure 3.14 (1, 3, 5 and 7 are output channels for transputer root). One may also
think of using the input channels at the same time, but this will be addressed in the
next section.

The receivers, in their turn, have simpler code than the transmitter, because
each one of them communicates only with transputer root. Figure 3.16 shows it.

If one tries to map the channels of Figure 3.16 to the configuration, one will
notice that there are no channels named in or out in Figure 3.14. To clarify that,
Figure 3.17 shows how the configuration would actually be coded for these processes
to be mapped and work properly.

PROC transmitter (CHAN chan0, chan2, chan4, chan6,
                       chan1, chan3, chan5, chan7) =
  ... declarations
  SEQ
    ... buffers initializations
    PAR
      chan0 ? flag0     -- flags are received from each
      chan2 ? flag1     -- of the receiving transputers
      chan4 ? flag2     -- and only after all of them
      chan6 ? flag3     -- are ready the timer is started
    TIME ? startime
    PAR
      BYTE.SLICE.OUTPUT (chan1, buffer0, 1, block.size)
      BYTE.SLICE.OUTPUT (chan3, buffer1, 1, block.size)
      BYTE.SLICE.OUTPUT (chan5, buffer2, 1, block.size)
      BYTE.SLICE.OUTPUT (chan7, buffer3, 1, block.size)
    TIME ? endtime
    -- transfer rate calculated will be in VAR rate
    transfer.rate (startime, endtime, 1, block.size, rate)


Figure 3.15  Code Used to Time Transmission Through the Four Links in Parallel.


PROC receiver (CHAN in, out) =
  ... declare variables
  ... initialize buffer
  SEQ
    out ! flag
    BYTE.SLICE.INPUT (in, buffer, 1, block.size)


Figure 3.16  Code for the Receivers.

As one may notice from Figure 3.17, the chan0 inside the procedure refers to
chan[0] in the configuration, and so on. We could think of chan[0] as the actual
parameter and chan0 as the corresponding formal parameter. This is not strictly true,
because in the configuration we are only placing the procedure on the processor, not
calling it, but the analogy is still valid and the names were chosen to make it easier
to understand. Users and programmers may use any names for channels, and in fact
we used some different ones in our implementations. What matters is to get the idea.

-- configuration

DEF root = 100 :          -- assigning a number to root
CHAN chan[8] :            -- channel variables for physical channels

PLACED PAR
  PROCESSOR root
    -- placing channel names on physical channels
    PLACE chan[0] AT link0in  :
    PLACE chan[1] AT link0out :
    PLACE chan[2] AT link1in  :
    PLACE chan[3] AT link1out :
    PLACE chan[4] AT link2in  :
    PLACE chan[5] AT link2out :
    PLACE chan[6] AT link3in  :
    PLACE chan[7] AT link3out :
    -- placing the procedure to be executed on the processor
    transmitter (chan[0], chan[2], chan[4], chan[6],
                 chan[1], chan[3], chan[5], chan[7])
  PROCESSOR 0
    PLACE chan[0] AT link0out :
    PLACE chan[1] AT link0in  :
    receiver (chan[1], chan[0])
  PROCESSOR 1
    PLACE chan[2] AT link0out :
    PLACE chan[3] AT link0in  :
    receiver (chan[3], chan[2])
  PROCESSOR 2
    PLACE chan[4] AT link0out :
    PLACE chan[5] AT link0in  :
    receiver (chan[5], chan[4])
  PROCESSOR 3
    PLACE chan[6] AT link0out :
    PLACE chan[7] AT link0in  :
    receiver (chan[7], chan[6])

Figure  3.17     Configuration  Code  for  the  Link  Evaluation  Program. 

Using the program described above, the results obtained for a block size of
1,500 bytes were:

•  1 channel: 3670 kbits/sec

•  2 channels in parallel: 3670 kbits/sec (in each channel)

•  3 channels in parallel: 3650 kbits/sec (in each channel)

•  4 channels in parallel: 3630 kbits/sec (in each channel)

These results show a slight decrease in performance as more channels operate in
parallel, but the improvement in communication performance due to parallelism is
nearly linear: the overall data transmission rate jumped from 3670 to 14520 kbits/sec
(4 x 3630).
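
The "nearly linear" claim can be checked by computing the aggregate rate and a scaling efficiency from the four per-channel figures. The following Python sketch is for illustration only; the variable names are hypothetical, and the efficiency measure (aggregate rate divided by the ideal of n times the single-channel rate) is an assumed definition, not one used in the thesis.

```python
# Per-channel rates in kbits/sec for 1, 2, 3 and 4 channels (1,500 byte blocks).
per_channel = {1: 3670, 2: 3670, 3: 3650, 4: 3630}

single = per_channel[1]
# Total rate over all active channels.
aggregate = {n: n * rate for n, rate in per_channel.items()}
# 1.0 would mean perfectly linear scaling with the number of channels.
efficiency = {n: aggregate[n] / (n * single) for n in per_channel}

for n in per_channel:
    print(f"{n} channel(s): aggregate {aggregate[n]} kbits/sec, "
          f"efficiency {efficiency[n]:.3f}")
```

With four channels the aggregate is 14520 kbits/sec, about 98.9 percent of the ideal 4 x 3670.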
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2.  Using  the  Oscilloscope 

As happened with the initial observations related in the previous section,
the programs used for this observation were adaptations of the ones just presented,
using the WHILE TRUE construct to permit continuous transmission and removing
all timing and flags, so we will not repeat them here. The maximum we could monitor
at one time was two channels. Two observations were then made:

•  two channels of different links transmitting in parallel (Figure 3.18),

•  two channels of the same link transmitting in parallel (Figure 3.19).


Oscilloscope Settings:

channel 1      --> shows transmitted frame (chan1)
channel 2      --> shows transmitted frame (chanS)
time scale     --> 0.5 microsec / division
voltage scale  --> 2 volts / division
vertical mode  --> ALT
trigger source --> channel 2 (lower)
storage mode used

Figure  3.18    Two  Channels  of  Different  Links  Transmitting  at  the  Same  Time. 

In the first case, Figure 3.18 shows the same frame used in the previous
section (TRUE) on two different channels, and one may notice how they overlap. It
is worth emphasizing that the interval between frames is due to the acknowledge delay
explained in the last section (not shown here), and that the phase difference between
the waveforms arises only because the communication processes began at different
absolute times. Since the links have the same speed and the procedures are the same,
this phase difference is kept constant for as long as the processes are running.
This photograph was taken in storage mode because in normal mode the lack of
synchronism between the channels did not permit us to see both waveforms clearly.


Oscilloscope Settings:

channel 1      --> shows transmitted frame (chan1)
channel 2      --> shows received frame (chan0)
time scale     --> 0.5 microsec / division
voltage scale  --> 2 volts / division
vertical mode  --> ALT
trigger source --> channel 2 (lower)
storage mode used

Figure  3.19     Two  Channels  of  the  Same  Link  Operating  at  the  Same  Time. 

In the second case, Figure 3.19 shows two channels of the same link operating
at the same time. This picture was also taken in storage mode, for the reasons just
mentioned. Note that this time we can observe the acknowledges piggy-backed on
the transmitted frames. The ACKs on the upper trace are sent for the frame being
shown on the lower trace, just a little earlier in time. The reverse is valid for the
lower trace's ACKs. Note that at the time the picture was taken, all eight channels
were operating in the same way.

3.  Using  the  Logic  Analyzer 

As mentioned in Chapter III, a snapshot of parallel operations is not easy to
get. Our Model 532 Logic State Analyzer could store up to 250 words of 32 bits each,
monitored by 2 logic probes of 16 bits each. We used only one of the probes, once
monitoring 4 channels and another time 8 channels. In the first case, monitoring 4
channels, three of them were carrying data frames transmitted by different links (the
handy "TRUE"s), and the last one was carrying the acknowledge of the fourth link.
Figure 3.20 is a reduction of the printout obtained from a representative part of the
250 words. Remember that the links are serial communication devices, whereas the
probes are more effective when monitoring parallel buses, especially if there is a clock
available on the bus (synchronous buses); this is why the sequences of "ones" appear
vertically in the picture. The program being used was EVALCONTTRUE.tds, with
the same configuration depicted in Figure 3.14, and the channels monitored were
chan1, chan3, chan5 and chan6. The program was continuously transmitting blocks of
15000 bytes of TRUEs through the four output channels (1, 3, 5, 7) of transputer root.
The first column is the memory position of the logic state analyzer. In the second
column, bits 4 and 8 (from left to right) carry chan1 and chan3, respectively. In the
third column, bits 4 and 8 again refer to chan6 and chan5, respectively.

Figure 3.21 likewise shows one representative section of the samples
collected by the logic analyzer, this time from eight channels distributed as Table 7
shows.

The "one" bits that appear in the other columns are probably cross-talk, due to
the probe being made of parallel wires, while the INMOS links always run in twisted
pairs. It is also worth repeating that when the links are transmitting and receiving in
parallel the acknowledges appear piggy-backed, as Figure 3.19 shows, and in the
sequence of bits we can notice them very clearly in several spots.

Although the evidence from both the logic analyzer and the oscilloscope makes
us sure that the channels indeed operate in parallel, it would be more satisfying to
obtain data that coincide more closely with the measurements taken by the software
experiments. Our suggestion is that monitoring the channels with a logic state analyzer
that can handle a faster clock would enable a more exact measurement of the
acknowledge delays and of the delays between successive word and byte transmissions,
by making timing diagrams of 4 and 8 channels in parallel. This, however, is left as a
suggestion for future research.

Figure 3.20  Output from the Logic Analyzer of 4 Channels in Parallel.

Figure 3.21  8 Channels Monitored with the Logic Analyzer.

TABLE 7
LINK MAP FOR FIGURE 3.21

channel    column    bit    probe id
chan0      second     3     D13
chan1      second     4     D12
chan2      second     7     D9
chan3      second     8     D8
chan4      third      3     D5
chan5      third      4     D4
chan6      third      7     D1
chan7      third      8     D0


Conclusion  3 
The  Links  really  are  able  to  operate  in  parallel! 


4.  Comparison  Between  the  Four  Constructs 


TABLE 8
EFFECT OF PARALLELISM ON TRANSFER RATES FOR 10000 BYTES BLOCK SIZE **

              input/output primitives    BYTE SLICE    WORD SLICE
                 bytes       words       procedure     procedure
1 channel         370        1510          3670          3670
2 channels        190         770          3670          3670
3 channels        160         640          3650          3650
4 channels        160         640          3630          3620

** Values are in kbits/sec, rounded to the nearest ten.

Table 8 shows the results obtained for 2, 3 and 4 links transmitting in parallel
for each of the constructs.

These results were obtained using the Link Evaluation Program for all the
constructs, listed in Appendix E, with no special priority for communications and
with no other processes being executed on the CPUs besides the transmitter and
receiver processes. The time measurements were made at the transputer root on the
B001 board. It is clear that, although the BYTE.SLICE and WORD.SLICE
procedures are not affected by more channels operating in parallel for this block size,
the input and output primitives indeed are; this will be addressed in the next section.
It is also worth mentioning that several attempts were made to increase the transfer
rate of the input and output primitives by using different loop sizes, no loops at all, or
different numbers of bytes or words after each ? or ! separated by semicolons, but in
none of these cases was a significant improvement noticed.

D.       MESSAGE  SIZE  AND  CHANNEL  PARALLELISM  INFLUENCE. 

Once we had completed the first phase of the research, validating the software we
were using, we turned our attention to the fourth and fifth research questions:

•  What is the effect of message length on the link transfer rates?

•  What is the mutual effect on the link transfer rates of more links operating in
parallel?

To address these topics, the Link Evaluation Program was designed and
implemented using the programming concepts presented in the previous sections of
this chapter. Basically, after the user chooses the type of construct and whether or not
a concurrent process runs on the CPUs of the communicating transputers (an option
named "cpumode"), the program builds a table showing the transfer rates for 16
different message sizes and 9 different channel parallelism cases for the chosen option,
and then prompts the user for a new run. Appendix H presents the program, written in
Occam, but one does not need to understand the program to grasp the results
obtained, which are presented in the following subsections and in the next chapter.
The configuration used for this program was the same as that of Figures 3.14 and 3.17.

1.  How  to  Read  the  Tables 

The tables have ten (10) columns, as follows:

•  BYTES - Shows the number of bytes transmitted for the results obtained in that
row.

•  1 OUT - Results obtained measuring transmission through only one channel,
from root to transputer 0.

•  1 IN - The same as above for reception on the root from transputer 0.

•  2 OUT - Results obtained measuring transmission in parallel through two
channels, from root to transputer 0 and transputer 1.

•  2 IN - Same as above for reception in parallel.

•  3 OUT - Results obtained measuring transmission in parallel from root to
transputers 0, 1 and 2.

•  3 IN - Same as above for reception in parallel.

•  4 OUT - Results obtained measuring parallel transmission from root to
transputers 0, 1, 2 and 3.

•  4 IN - Same as above for reception in parallel.

•  4 IN/OUT - Results obtained measuring transmission and reception in parallel
to/from transputers 0, 1, 2 and 3, using all 8 channels of the four links that
exist in one transputer.


TABLE 9
TRANSPUTER LINK TRANSFER RATE
BYTE SLICE (1) - NO CONCURRENT PROCESS - 10 MBITS/SEC

BYTES   1 OUT    1 IN   2 OUT    2 IN   3 OUT    3 IN   4 OUT    4 IN  4INOUT
    1     625     616     250     250     200     198     161     161      98
    2    1217    1237     500     500     400     400     325     333     196
    4    1531    2130     779    1000     648     788     650     646     384
    8    2183    2811    1570    1582    1311    1301    1085    1096     690
   16    2758    2924    2101    2222    1948    1919    1702    1694    1255
   32    3224    3246    2589    2300    2482    2544    2330    2398    1835
   64    3427    3646    3116    3226    2942    3048    2817    2954    2462
  128    3543    3644    3332    3497    3265    3390    3187    3320    2945
  256    3605    3741    3496    3656    3444    3596    3398    3558    3231
  512    3635    3778    3578    3733    3555    3697    3509    3677    3401
 1024    3650    3754    3627    3741    3604    3712    3575    3702    3512
 1280    3654    3748    3640    3742    3611    3713    3587    3698    3529
 2048    3658    3740    3652    3738    3621    3715    3604    3703    3549
 4096    3662    3735    3663    3733    3634    3720    3618    3709    3573
 8192    3665    3732    3668    3732    3645    3721    3627    3714    3585
10000    3667    3731    3669    3730    3647    3721    3623    3717    3591

* Values in kbits/sec

Some tables, though, have three different columns, labeled:

•  1 IN/OUT - instead of 1 IN

•  2 IN/OUT - instead of 2 IN

•  3 IN/OUT - instead of 3 IN

In these columns, as the reader may have guessed already, the results presented
refer to the transputer root transmitting and receiving at the same time through the
number of links specified.

For each of the constructs the results are presented in table format and, when
necessary, with a graphical representation of the table. Each individual number in the
tables is the average of 20 sequential runs. The results are in kbits/sec because floating
point is not available in Proto-Occam and we needed the precision.
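
To illustrate why an integer result in kbits/sec preserves precision, here is a minimal sketch in Python (chosen only for illustration, since the thesis programs are in Proto-Occam). It assumes a timer that counts microseconds and does not wrap around during the measurement. The function names, their parameters and the 21,816 microsecond example are hypothetical and are not taken from the transfer.rate procedure of the thesis.

```python
def transfer_rate_kbits(start_us, end_us, blocks, block_size_bytes):
    """Integer transfer rate in kbits/sec, using integer arithmetic only."""
    elapsed_us = end_us - start_us
    bits = blocks * block_size_bytes * 8
    # Bits per microsecond is Mbits/sec; scale by 1000 before dividing so
    # that the integer division keeps four significant digits.
    return (bits * 1000) // elapsed_us

def average_rate(rates):
    """Integer average of several runs (the tables average 20 runs)."""
    return sum(rates) // len(rates)

# Hypothetical example: one 10,000 byte block moved in 21,816 microseconds.
print(transfer_rate_kbits(0, 21816, 1, 10000))  # 3667
```

Expressed in whole Mbits/sec, the same hypothetical measurement would truncate to 3, which shows why the finer unit is needed when only integers are available.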
2.  BYTE SLICE Procedure

Tables 9 and 10 show the transfer rates obtained for this procedure, with
communication being the only process executed by the transputers involved.
Figure 3.22 is a graphical representation of Table 9.


TABLE 10
TRANSPUTER LINK TRANSFER RATE
BYTE SLICE (2) - NO CONCURRENT PROCESS - 10 MBITS/SEC

BYTES   1 OUT  1INOUT   2 OUT  2INOUT   3 OUT  3INOUT   4 OUT    4 IN  4INOUT
    1     625     248     250     166     194     125     161     166      93
    2    1250     500     500     333     400     245     322     333     196
    4    1518     793     779     651     645     487     645     648     384
    8    2201    1585    1567    1099    1318     851    1101    1105     689
   16    2782    2155    2208    1711    1951    1458    1701    1714    1256
   32    3227    2636    2702    2326    2503    2133    2314    2379    1837
   64    3513    3067    3116    2850    2988    2667    2813    2975    2741
  128    3573    3350    3368    3204    3305    3075    3186    3320    2926
  256    3628    3491    3522    3417    3476    3333    3404    3533    3230
  512    3663    3574    3601    3532    3578    3433    3516    3662    3404
 1024    3684    3617    3651    3537    3632    3563    3531    3633    3437
 1280    3687    3624    3657    3591    3640    3584    3595    3685    3510
 2048    3692    3636    3672    3597    3653    3603    3617    3639    3543
 4096    3694    3648    3683    3616    3671    3623    3624    3692    3576
 8192    3698    3655    3690    3619    3678    3632    3632    3694    3596
10000    3699    3657    3692    3614    3679    3638    3629    3696    3597

* Values in kbits/sec

From Table 9 we can notice the overall tendency of input to be quicker than
output, due to the way the timers are started and stopped by the flags. The flags used
in this and all following tables were sent from the B003 transputers to the root.
When we had the flags inverted, the values tended to be bigger for the output, so one
may disregard the difference. For this reason we tried, most of the time, to show the
values for "in/out" instead of "in". Most important, however, is the effect of message
size and channel parallelism, which reduce the transfer rates noticeably for smaller
message sizes but have an insignificant effect for message sizes above 256 bytes.

Figure 3.22  Transputer Link Transfer Rate
Byte Slice - No Process in Parallel - 10 mbits/sec.


3.  WORD  SLICE  Procedure 


TABLE 11
TRANSPUTER LINK TRANSFER RATE
WORD SLICE - NO CONCURRENT PROCESS - 10 MBITS/SEC

BYTES   1 OUT    1 IN   2 OUT    2 IN   3 OUT    3 IN   4 OUT    4 IN  4INOUT
    4    1287    1368     666     811     533     625     452     512     294
    8    1910    2513    1333    1330    1063    1061     893     390     540
   16    2530    3025    1985    1956    1667    1682    1454    1466     998
   32    3033    3377    2541    2588    2234    2300    2077    2230    1632
   64    3321    3559    2956    3091    2830    2875    2647    2790    2266
  128    3491    3679    3294    3406    3156    3278    3063    3213    2783
  256    3572    3738    3492    3593    3401    3521    3339    3485    3143
  512    3617    3771    3571    3707    3530    3662    3479    3634    3373
 1024    3644    3754    3624    3735    3539    3694    3557    3679    3436
 1280    3543    3739    3638    3736    3594    3699    3570    3680    3516
 2048    3655    3740    3643    3734    3613    3708    3595    3694    3546
 4096    3662    3733    3659    3731    3638    3714    3609    3705    3566
 8192    3664    3730    3669    3730    3642    3717    3621    3711    3536
10000    3666    3730    3669    3729    3645    3713    3622    3714    3583

* Values in kbits/sec

As seen in Table 11, the results obtained from WORD SLICE are very similar
to the ones obtained for the BYTE SLICE procedure, so the reader may use the same
Figure 3.22 to get a better feeling for what these numbers mean. All comments made
for the BYTE SLICE procedure are also valid for WORD SLICE.


Conclusion 4

Message size has a major effect on the transfer rate for block
transfers (BYTE SLICE and WORD SLICE): small messages reduce it.
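
One way to picture Conclusion 4 is a fixed-overhead model, in which every block transfer pays a constant start-up time plus a constant time per byte. The Python sketch below (Python is used only for illustration) fits the two parameters to the 1 byte and 10000 byte entries of the 1 OUT column of Table 9 and compares the prediction with the other measured entries. The model itself is an assumption made for this illustration, not something the thesis measurements establish, and all names in the code are hypothetical.

```python
# 1 OUT column of Table 9 (BYTE SLICE): message size in bytes -> kbits/sec.
measured = {1: 625, 2: 1217, 4: 1531, 8: 2183, 16: 2758, 32: 3224,
            64: 3427, 128: 3543, 256: 3605, 512: 3635, 1024: 3650,
            1280: 3654, 2048: 3658, 4096: 3662, 8192: 3665, 10000: 3667}

def transfer_time_us(size_bytes, rate_kbits):
    """Time in microseconds to move `size_bytes` at `rate_kbits` kbits/sec."""
    return size_bytes * 8 * 1000.0 / rate_kbits

# Fit time(n) = overhead_us + n * per_byte_us through the two end points.
t_small = transfer_time_us(1, measured[1])
t_large = transfer_time_us(10000, measured[10000])
per_byte_us = (t_large - t_small) / (10000 - 1)
overhead_us = t_small - per_byte_us

def predicted_rate(size_bytes):
    """Rate in kbits/sec predicted by the fixed-overhead model."""
    return size_bytes * 8 * 1000.0 / (overhead_us + size_bytes * per_byte_us)

worst_error = max(abs(predicted_rate(n) - r) / r for n, r in measured.items())
print(f"overhead {overhead_us:.1f} us, per byte {per_byte_us:.2f} us, "
      f"worst relative error {worst_error:.1%}")
```

Under these assumptions the fit gives an overhead of about 10.6 microseconds per block and about 2.18 microseconds per byte, and it tracks the measured column to within roughly 12 percent, with the largest deviations at 2 and 4 bytes. A fixed cost of that size is negligible for blocks of a few hundred bytes but dominates a one-byte transfer.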


4.  Input  and  Output  Primitives 

a.    Transmitting  and  Receiving  Bytes 

Table 12 shows the results of using the input and output primitives to
transmit and receive bytes. As we can see, there is no variation as the number of bytes
increases. This is because each byte is transmitted individually, as can be seen in
Figure 3.12. We can also notice that there is a significant decrease as more channels
transmit in parallel. We mention again that we tried several loop sizes, and even no
loop at all with bytes separated by semicolons, but the results we got were never
significantly bigger than the ones presented. Table 13 stresses the comparison when
both channels of the same link are operating at the same time, transmitting and
receiving messages. Note how the results in columns 3, 5 and 7 of Table 12 are 50% to
100% bigger than the corresponding ones in Table 13.

TABLE 12
TRANSPUTER LINK TRANSFER RATE
INPUT/OUTPUT (BYTES 1) - NO CONCURRENT PROCESS - (10 MBITS/SEC)

BYTES   1 OUT    1 IN   2 OUT    2 IN   3 OUT    3 IN   4 OUT    4 IN  4INOUT
    1     370     547     192     227     156     179     156     147      89
    2     370     436     138     229     153     181     156     149      89
    4     377     492     192     232     156     183     157     149      89
    8     373     480     190     231     155     183     156     148      89
   16     375     510     191     231     155     183     157     148      39
   32     374     511     191     231     155     183     157     148      89
   64     374     504     191     231     155     183     157     148      89
  128     374     506     191     231     155     183     157     148      89
  256     374     505     191     231     155     183     157     148      89
  512     374     506     191     231     155     183     157     148      89
 1024     374     506     191     231     155     183     157     148      89
 1280     374     506     191     231     155     183     157     148      89
 2048     374     505     191     231     155     183     157     148      89
 4096     374     510     191     231     155     183     157     148      39
 8192     374     510     191     231     155     183     157     148      89
10000     374     510     191     231     155     183     157     148      89

* Values in kbits/sec
b.    Transmitting and Receiving Words (Integers)

Table 14 shows the results for transmitting integers, and we can notice
again that message size does not affect the transfer rate, but more channels operating
in parallel do. As we should expect from the previous results, this rate is, on
average, 4 times larger than the one for transmitting bytes.

Table 15 shows the comparison when both channels of the same link are
transmitting and receiving at the same time. Again we confirm that, in terms of link
performance, having the same link transmitting and receiving is worse than having
two different links transmitting at the same time.
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TABLE  13 

TRANSPUTER LINK TRANSFER RATE
INPUT/OUTPUT (BYTES 2) - NO CONCURRENT PROCESS - (10 MBITS/SEC)

BYTES   1 OUT  1INOUT   2 OUT  2INOUT   3 OUT  3INOUT   4 OUT    4 IN  4INOUT
    1     370     236     189     151     156     112     156     147      76
    2     370     232     188     149     153     112     156     149      89
    4     370     235     192     150     157     112     157     149      89
    8     373     235     190     150     155     113     156     148      89
   16     373     235     190     150     155     113     156     148      89
   32     373     235     190     150     155     113     156     143      89
   64     374     235     191     151     155     113     156     148      89
   32     374     235     191     151     155     113     156     148      89
   64     374     235     191     151     155     113     157     148      89
  128     374     235     191     150     155     113     157     148      89
  256     374     235     191     150     155     113     157     148      89
  512     374     235     191     151     155     113     156     143      89
 1024     374     235     191     151     155     113     157     148      89
 1280     374     235     191     151     155     113     157     148      89
 2048     374     235     191     151     155     113     156     148      89
 4096     374     235     191     151     155     113     157     148      89
 8192     374     235     191     150     155     113     157     148      89
10000     374     235     191     150     155     113     157     148      89

* Values in kbits/sec

Conclusion 5

More channels in parallel greatly reduce the transfer rate
for all constructs except block transfers (BYTE SLICE and
WORD SLICE) bigger than 256 bytes.


This conclusion does not contradict Conclusion 3, but reduces the universe
in which that conclusion is applicable.
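
The difference between the constructs is easiest to see in the aggregate rate, that is, the per-channel rate multiplied by the number of active channels. The following Python sketch (illustrative only) uses the 10000 byte rows of the 1 OUT to 4 OUT columns of Tables 9, 12 and 14. The dictionary and function names are hypothetical, and the "gain" figure is an assumed measure introduced here for comparison.

```python
# Per-channel output rates in kbits/sec for 1 to 4 channels (10000 byte rows).
out_rates = {
    "BYTE SLICE (Table 9)": [3667, 3669, 3647, 3623],
    "primitives, bytes (Table 12)": [374, 191, 155, 157],
    "primitives, words (Table 14)": [1509, 767, 641, 636],
}

def aggregate_rates(per_channel):
    """Total rate over all active channels for 1, 2, 3, ... channels."""
    return [(i + 1) * rate for i, rate in enumerate(per_channel)]

for construct, rates in out_rates.items():
    totals = aggregate_rates(rates)
    gain = totals[-1] / totals[0]  # aggregate with 4 channels vs 1 channel
    print(f"{construct}: totals {totals}, gain with 4 channels {gain:.2f}")
```

With block transfers the aggregate grows almost fourfold, while with the primitives it grows by less than a factor of two, which is the restriction that Conclusion 5 places on Conclusion 3.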
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TABLE  14 

TRANSPUTER LINK TRANSFER RATE
INPUT/OUTPUT (WORDS 1) - NO CONCURRENT PROCESS - (10 MBITS/SEC)

BYTES   1 OUT    1 IN   2 OUT    2 IN   3 OUT    3 IN   4 OUT    4 IN  4INOUT
    4    1526    2330     769    1000     643     785     628     640     377
    8    1491    2369     763    1000     634     770     631     634     375
   16    1484    2290     761    1000     635     769     629     640     375
   32    1509    2326     765    1000     635     769     629     640     375
   64    1504    2321     767    1003     640     772     635     642     376
  128    1505    2367     766    1003     640     772     635     642     377
  256    1509    2366     767    1004     641     772     635     642     376
  512    1508    2383     767    1004     641     773     635     642     377
 1024    1509    2382     767    1004     641     773     635     642     376
 1280    1508    2384     767    1004     641     774     635     642     377
 2048    1509    2384     767    1004     641     774     635     642     377
 4096    1511    2396     767    1004     641     774     635     642     377
 8192    1510    2394     767    1005     641     774     636     643     377
10000    1509    2394     767    1005     641     774     636     643     377

* Values in kbits/sec

TABLE 15

TRANSPUTER LINK TRANSFER RATE - INPUT/OUTPUT (WORDS 2) -
NO CONCURRENT PROCESS - (10 MBITS/SEC)

BYTES      1 OUT  1 INOUT    2 OUT  2 INOUT    3 OUT  3 INOUT    4 OUT     4 IN  4 INOUT
4           1428      959      769      645      638      476      625      645      377
8           1431      959      769      634      634      470      621      634      375
16          1495      963      761      637      636      470      620      634      376
32          1509      969      765      640      640      471      622      637      376
64          1506      969      767      640      640      472      623      637      376
128         1505      969      767      640      640      472      623      637      376
256         1509      969      767      640      641      473      624      638      376
512         1509      970      767      641      641      473      624      638      377
1024        1509      970      767      641      641      473      624      638      377
1280        1509      970      767      641      641      473      624      637      377
2048        1510      970      767      641      641      473      624      638      377
4096        1510      971      767      641      641      473      624      637      377
8192        1510      971      767      641      641      473      624      638      377
10000       1510      971      767      641      641      473      624      637      377

* Values in kbits/sec
IV.  THE MUTUAL EFFECTS BETWEEN PROCESSOR AND THE FOUR LINKS


When the transputer is used in a multi-transputer configuration, each
transputer node will most likely need at least one process to route
messages between transputers and another to execute some processing
task. This chapter examines how a task-oriented process affects a
routing process by changing the transfer rate on the links. It also
analyzes how a routing process handling large messages may affect the
throughput of a computation-bound process.

A.       EFFECT  OF  CONCURRENT  PROCESSES  OVER  COMMUNICATIONS 

1.  Initial Considerations

This section addresses the sixth and seventh research questions:

•  Can the CPU execute a process in parallel with some or all of the links operating?

•  What is the effect of a communication-independent process running on the
CPU on the transfer rates obtained on a link by another process in the same
transputer?

To observe this effect with the links working at the 10 mbits/sec rate, the same
Evaluation Program was used, but with different program-defined CPU modes.
Figure 4.1 shows the CPU modes that the program offers to the user.


0  -  No concurrent process in the cpus
1  -  B003 cpus with sum process concurrently (par)
2  -  all cpus with sum process concurrently (par)
3  -  B003 cpus with sum process concurrently (pripar)
4  -  all cpus with sum process concurrently (pripar)
5  -  B003 cpus with array product process concurrently (par)
6  -  all cpus with array product process concurrently (par)
7  -  B003 cpus with array product process concurrently (pripar)
8  -  all cpus with array product process concurrently (pripar)


Figure 4.1     CPU Modes Available in the Link Evaluation Program.
One of two procedures, "cpubusysum" or "cpubusyprod", would run
concurrently with the transmitter and receiver procedures in one or both
communicating CPUs, according to the CPU mode chosen, with the following
effects:

•  "cpubusysum" - This procedure starts when communications start and
executes sum operations continuously until the communications finish,
with few memory accesses involved.

•  "cpubusyprod" - This procedure likewise starts when communications
start and executes array products continuously until the communications
finish. It requires 100 times more memory accesses.

Figure 4.2 shows the code that permits this (e.g., for transmission) for a WORD
SLICE construct. Similar code exists for the other constructs, changing only the
procedure "wordtransfer" to the applicable one. See Appendix E for more details.


SEQ   -- main word.slice.transfer
  -- word buffers initialization
  SEQ k = [1 FOR maxwordblock.size]
    SEQ
      wbuffer0[k] := 10000
      wbuffer1[k] := 20000
      wbuffer2[k] := 30000
      wbuffer3[k] := 40000
  SKIP
  IF
    cpumode = '2'
      PAR
        wordtransfer (repetition, cpumode, flag, counter)
        cpubusysum (flag, counter)
    cpumode = '4'
      PRI PAR
        wordtransfer (repetition, cpumode, flag, counter)
        cpubusysum (flag, counter)
    cpumode = '6'
      PAR
        wordtransfer (repetition, cpumode, flag, counter)
        cpubusyprod (flag, counter)
    cpumode = '8'
      PRI PAR
        wordtransfer (repetition, cpumode, flag, counter)
        cpubusyprod (flag, counter)
    TRUE
      wordtransfer (repetition, cpumode, flag, counter):


Figure 4.2     How the Concurrent Processes Were Called.

2.  Process Priority Considerations

The transputer supports two priority levels built into the hardware:

1  Priority 0 (High) - processes with this priority always run, without
being interrupted, until they finish. They should run only for a short period of
time, because if the total time spent by all high priority processes exceeds a
time slice, the low priority processes will not be able to proceed [Ref. 7: section
2, p. 3]. High priority processes preempt low priority ones.

2  Priority 1 (Low) - these processes are executed when no high priority process
is able to proceed, time sliced at 1 msec for each process.

In our program, with two processes to be executed in the same CPU, three
situations were examined. Assuming the processes are named, for example,
"communication" and "cpubusy", we have:

a.  Both processes under a PAR construct - in this case the processes are time
sliced every 1 msec, because both are low priority. This case was observed
by using "cpumode" 1, 2, 5 and 6 in the Link Evaluation Program.

b.  Both processes under a PRI PAR with communications in high priority - in this
case communication is always executed at once. Remember that it took
31.5 msec for a 15,000 byte block to be transmitted, and the time slice is 1
msec, so cpubusy will not have a chance to execute while
communication is going on, unless the number of bytes transmitted is smaller
than 475:

•  (475 x 8) / 3,800,000 = 0.001 sec or 1 msec, if we consider a rate of 3.8
mbits/sec.

These cases were observed by using "cpumode" 3, 4, 7 and 8 in the
Link Evaluation Program.

c.  Both processes under PRI PAR but with the cpubusy process in high priority -
in this case the communications never occurred. The "cpubusy" process,
although started together with the communications, is designed to be stopped
by a flag from the communication process, and that flag could never arrive
because the communication process was never granted CPU time. This is why
the Link Evaluation Program makes no mention of this case.
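
The 475 byte limit quoted in case b follows directly from the link rate and the time slice. A minimal sketch of that arithmetic, written in Python for convenience (the function names are illustrative and are not part of the Link Evaluation Program), assuming the 3.8 mbits/sec rate and 1 msec time slice stated above:

```python
def bytes_per_slice(rate_bits_per_sec, slice_ms):
    """Bytes a link can move during one time slice of slice_ms milliseconds."""
    return rate_bits_per_sec * slice_ms / 8000  # 8 bits per byte, 1000 ms per sec


def transfer_time_ms(n_bytes, rate_bits_per_sec):
    """Milliseconds needed to move n_bytes at the given bit rate."""
    return n_bytes * 8000 / rate_bits_per_sec


# Assumed figures from the text: 3.8 mbits/sec link rate, 1 msec time slice.
print(bytes_per_slice(3_800_000, 1))                   # 475.0
print(round(transfer_time_ms(15_000, 3_800_000), 2))   # 31.58
```

The second value is the time for the 15,000 byte block, which the text quotes as 31.5 msec.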

Again, analyses were made for the four constructs, and the results are
presented in tables and graphs similar to the ones in the previous chapter.

Another point to mention is that when a concurrent process was placed in only
one of the communicating CPUs, the B003 transputer was the one chosen, because of
its higher internal clock. When the B001 transputer was given the "cpubusy" process
first, no change in the transfer rates was noticed when a cpubusy process was then
added to the B003 transputers. Done the way we did it, the two-step change could be
seen clearly.

3.  BYTE SLICE Procedure

a.    Using the PAR Construct

(1)    One Transputer Only (cpumode = 1 or 5).

Table 16 shows the results for "cpumode = 1" and Table 17 for "cpumode
= 5"; Figure 4.3 is the graphical representation of Table 16.

We can observe that when the CPU has a concurrent process running
with the same priority as the communications process, the transfer rate is reduced
by 10% to 99.5% relative to the original values in Table 9.
TABLE 16

TRANSPUTER LINK TRANSFER RATE - BYTE SLICE -
PROCEDURE CPUBUSYSUM CONCURRENT AT THE B003 - 10 MBITS/SEC

BYTES      1 OUT  1 INOUT    2 OUT  2 INOUT    3 OUT  3 INOUT    4 OUT     4 IN  4 INOUT
1              3        1        3        1        3        1        3        3        1
2              7        3        7        3        7        3        7        7        3
4             15        7       15        7       15        7       15       15        7
8             31       15       31       15       31       15       31       30       15
16            61       30       61       30       61       30       61       61       30
32           128       61      121       61      120       61      120      120       61
64           253      121      235      120      234      120      234      234      121
128          597      234      441      234      442      234      442      445      234
256          739      441      792      441      804      441      801      811      440
512         1311      813     1326      788     1348      788     1337     1320      786
1024        2010     1315     1954     1317     1969     1315     1957     1994     1313
1280        2204     1518     2181     1494     2123     1489     2121     2142     1489
2048        2546     1938     2561     1916     2552     1934     2550     2588     1929
4096        3013     2535     3030     2535     2999     2527     2979     3017     2517
8192        3324     3004     3316     2976     3312     2967     3280     3326     2955
10000       3386     3100     3380     3083     3370     3065     3332     3385     3051

* Values in kbits/sec.

TABLE 17

TRANSPUTER LINK TRANSFER RATE - BYTE SLICE -
PROCEDURE CPUBUSYPROD CONCURRENT AT THE B003 - 10 MBITS/SEC

BYTES      1 OUT  1 INOUT    2 OUT  2 INOUT    3 OUT  3 INOUT    4 OUT     4 IN  4 INOUT
1              3        1        3        1        3        1        3        3        1
2              7        3        7        3        7        3        7        7        3
4             15        7       15        7       15        7       15       15        7
8             31       15       31       15       31       15       31       30       15
16            61       30       61       30       61       30       61       61       30
32           128       61      121       61      120       61      120      121       61
64           253      121      235      120      234      120      234      235      121
128          479      234      443      234      443      234      445      444      234
256          854      441      805      441      804      441      804      810      441
512         1402      818     1349      790     1338      789     1339     1327      788
1024        1975     1316     1973     1318     1971     1318     1924     1971     1295
1280        2151     1492     2132     1496     2123     1490     2110     2144     1490
2048        2568     1938     2573     1938     2526     1929     2515     2544     1923
4096        3036     2539     3006     2529     2999     2510     2982     3020     2498
8192        3332     2989     3320     2989     3311     2965     3281     3355     2964
10000       3396     3091     3332     3076     3374     3068     3343     3417     3060

* Values in kbits/sec.
Figure 4.3    Transputer Link Transfer Rate - BYTE SLICE
Procedure Cpubusysum Concurrent at the B003 - 10 mbits/sec.

This was a great surprise to us because we are timing only the
communication itself. Although we cannot prove it, it looks as if the communication
process is alive and sharing CPU time with the cpubusy process, instead of being
inactive while the links communicate, as all the references led us and our predecessors
to believe [Ref 5: p. 16].

(2)    Both Transputers Busy (cpumode = 2 or 6).


TABLE 18

TRANSPUTER LINK TRANSFER RATE - BYTE SLICE -
PROCEDURE CPUBUSYSUM CONCURRENT AT ALL CPUS - 10 MBITS/SEC

BYTES      1 OUT  1 INOUT    2 OUT  2 INOUT    3 OUT  3 INOUT    4 OUT     4 IN  4 INOUT
1              2        1        1        1        1        1        1        1        1
2              4        2        2        2        2        2        2        2        2
4              9        4        4        4        4        4        4        4        4
8             19        9        9        9        9        9        9        9        9
16            39       19       19       19       19       19       19       19       19
32            78       39       39       39       39       39       39       39       33
64           156       78       78       73       78       77       77       78       77
128          312      156      156      156      156      156      156      156      155
256          624      312      312      312      312      312      312      312      311
512         1249      624      624      624      624      624      624      624      623
1024        2498     1248     1249     1248     1248     1248     1249     1249     1247
1280        3120     1560     1560     1561     1561     1561     1561     1561     1559
2048        2498     1665     1665     1665     1666     1666     1665     1665     1664
4096        3332     2498     2499     2499     2499     2499     2499     2499     2497
8192        3332     2355     2856     2856     2856     2856     2856     2856     2855
10000       3487     3050     3050     3050     3050     3050     3050     3050     3049

* Values in kbits/sec.

Table 18 and Figure 4.4 need no explanation. The results for
"cpubusyprod" are not presented because they turned out to be exactly the same
as for "cpubusysum", as we saw in the previous subsection.

One may notice in Table 18, column "1 OUT", that the value for 2048
bytes (2498) is much smaller than the previous one (3120); the effect is clearly seen
in Figure 4.4. What may be happening is that the initial bytes of the buffer declared
in the program (buffer0) reside in the internal memory of the transputer (2
kbytes), and when external memory begins to be accessed the transfer rate goes down,
or its rate of increase is reduced, as we can see on the lower curve of Figure 4.4, where
all the remaining curves coincide and show a break in the rate of increase at the same
point. This is what the author thinks is happening, but we were not able to prove it.
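
One way to look at the dip at 2048 bytes is to convert each published rate back into the time taken per message. The sketch below does this in Python for part of the "1 OUT" column of Table 18; the helper name is hypothetical, and the calculation only restates the table values. It is not evidence for or against the memory explanation offered above.

```python
# (message size in bytes, rate in kbits/sec) copied from Table 18, column "1 OUT".
ONE_OUT = [(128, 312), (256, 624), (512, 1249), (1024, 2498),
           (1280, 3120), (2048, 2498), (4096, 3332)]


def ms_per_message(n_bytes, rate_kbits_per_sec):
    """Time per message in msec (1 kbit/sec equals 1 bit/msec)."""
    return n_bytes * 8 / rate_kbits_per_sec


for size, rate in ONE_OUT:
    print(size, round(ms_per_message(size, rate), 2))
```

On these figures the time per message stays close to 3.3 msec up to 1280 bytes and is about twice that at 2048 bytes.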
Figure 4.4    Transputer Link Transfer Rate - BYTE SLICE
Procedure Cpubusysum Concurrent at All CPUs - 10 mbits/sec.

b.    Using the PRI PAR Construct


TABLE 19

TRANSPUTER LINK TRANSFER RATE - BYTE SLICE -
PROCEDURE CPUBUSYSUM CONCURRENT AT THE B003 (HIGH) - 10 MBITS/SEC

BYTES      1 OUT  1 INOUT    2 OUT  2 INOUT    3 OUT  3 INOUT    4 OUT     4 IN  4 INOUT
1            625      248      250      166      194      125      161      166       98
2           1250      500      500      333      400      245      322      333      196
4           1518      793      779      651      645      487      645      648      384
8           2201     1585     1567     1099     1318      851     1101     1105      689
16          2782     2155     2208     1711     1951     1458     1701     1714     1256
32          3227     2636     2702     2326     2503     2133     2314     2379     1837
64          3513     3067     3116     2850     2988     2667     2813     2975     2471
128         3578     3350     3368     3204     3305     3075     3186     3320     2926
256         3628     3491     3522     3417     3476     3333     3404     3538     3230
512         3663     3574     3601     3532     3578     3483     3516     3662     3404
1024        3684     3617     3651     3537     3632     3568     3581     3683     3487
1280        3687     3624     3657     3591     3640     3584     3595     3685     3510
2048        3692     3636     3672     3597     3653     3603     3617     3689     3543
4096        3694     3648     3683     3616     3671     3623     3624     3692     3576
8192        3698     3655     3690     3619     3678     3632     3632     3694     3596
10000       3699     3657     3692     3614     3679     3638     3629     3696     3597

* Values in kbits/sec.

Table 19 and Figure 4.5 show the results for one concurrent process
running in the B003 transputers, and Table 20 and Figure 4.6 show the same for all
CPUs running a concurrent process; in all cases communication has the high priority.

As we see, the figures are on average even better than when no process
was running concurrently, as seen in Table 9. This is why we believe and suggest that
processes that handle only communications, such as routers, should always be given
high priority.

For each of the possible cases, Table 21 shows the number of operations
executed concurrently in each transputer. Although the absolute values have little
meaning, they give a comparative measure of the behavior of the CPU under the
different constructs. The reason lies in the way the program was written: there are
intervals between the communication sessions and repetitions where the cpubusy
process is able to run, time sliced with the calculations and the output to the
screen done after each of these sessions.
TABLE 20

TRANSPUTER LINK TRANSFER RATE - BYTE SLICE -
PROCEDURE CPUBUSYSUM CONCURRENT AT ALL CPUS (HIGH) - 10 MBITS/SEC

BYTES      1 OUT  1 INOUT    2 OUT  2 INOUT    3 OUT  3 INOUT    4 OUT     4 IN  4 INOUT
1           1132      555      500      263      359      172      263      263      127
2           1728      931      943      526      690      353      520      526      256
4           2288     1540     1436     1036     1204      631      977      992      498
8           2684     2165     2115     1621     1831     1341     1565     1723      986
16          3203     2740     2549     2245     2435     1933     2199     2352     1664
32          3459     3136     3102     2792     2939     2526     2730     2912     2265
64          3549     3376     3375     3156     3276     2996     3139     3295     2783
128         3615     3511     3533     3399     3468     3303     3384     3517     3150
256         3655     3579     3600     3525     3568     3475     3516     3651     3359
512         3576     3621     3631     3589     3623     3565     3577     3720     3491
1024        3633     3641     3665     3623     3644     3619     3613     3714     3542
1280        3682     3646     3663     3630     3651     3626     3623     3707     3553
2048        3683     3651     3675     3641     3659     3541     3634     3702     3579
4096        3690     3657     3675     3650     3663     3649     3642     3701     3595
8192        3639     3661     3675     3655     3673     3553     3646     3698     3610
10000       3639     3663     3679     3555     3673     3654     3647     3698     3614

* Values in kbits/sec.

TABLE 21

NUMBER OF OPERATIONS EXECUTED CONCURRENTLY IN EACH CPU* -
BYTE SLICE USED

                Transputer B003          Transputer B001
                cpu.sum    cpu.prod      cpu.sum    cpu.prod
1 PAR             2.7        6.5         inactive   inactive
2 PAR             5.0       11.9           3.9        7.8
1 PRI PAR         1.3        3.1         inactive   inactive
2 PRI PAR         1.3        3.1           0.9        1.8

* Values are in millions.


4.  WORD SLICE Procedure

For the WORD SLICE Procedure, the results are very similar
to the ones obtained for the BYTE SLICE Procedure and will not be repeated
here. The reader may refer to all Tables and Figures described in the last section,
remembering only that for word transfers the minimum number of bytes is 4, so the
first two rows should be disregarded.

Figure 4.5     Transputer Link Transfer Rate - BYTE SLICE
Procedure Cpubusysum Concurrent at the B003 (high) - 10 mbits/sec.

Figure 4.6    Transputer Link Transfer Rate - BYTE SLICE
Procedure Cpubusysum Concurrent at All CPUs (high) - 10 mbits/sec.

5.  Input  and  Output  Primitives 

a.    Transmitting  and  Receiving  Bytes 


TABLE 22

TRANSPUTER LINK TRANSFER RATE* - INPUT/OUTPUT (BYTES)
PROC CPUBUSYSUM CONCURRENT - 10 MBITS/SEC

cpumode        1 out  1 in/out     2 out  2 in/out     3 out  3 in/out     4 out  4 in/out
1 PAR              3         1         3         1         3         1         3         1
2 PAR              2         1         1         1         1         1         1         1
1 PRIPAR         370       230       190       150       160       110       155        90
2 PRIPAR         575       350       370       225       295       155       235       115

* Values in kbits/sec

Following a tendency observed before, the transfer rates did not vary
with the message size. Table 22 shows the figures obtained for the
various priority schemes used.

These results were the same for the procedure "cpubusyprod" and for this
reason are not shown separately.

b.    Transmitting  and  Receiving  Integers 


TABLE 23

TRANSPUTER LINK TRANSFER RATE* - INPUT/OUTPUT (WORDS)
PROC CPUBUSYSUM CONCURRENT - 10 MBITS/SEC

cpumode        1 out  1 in/out     2 out  2 in/out     3 out  3 in/out     4 out  4 in/out
1 PAR             15         7        15         7        15         7        15         7
2 PAR              9         4         4         4         4         4         4         4
1 PRIPAR        1510       970       765       640       640       470       625       375
2 PRIPAR        2345      1560      1450      1040      1200       650       930       480

* Values in kbits/sec
Table 23 shows the results for transmitting and receiving integers with
input and output primitives.

Several conclusions may be drawn from Tables 22 and 23:

•  Results for integers are in general four times larger than for bytes.

•  A process running concurrently does affect the communications if under a PAR
construct. Results are 50 to 100 times smaller than the ones obtained with no
concurrent process using the CPU. See Table 13.

•  When communications run under PRI PAR on the B003 transputers, the same
results are obtained as with no other concurrent process. Compare the
third row of Table 22 (1 in/out) with Table 13 for bytes, and the third row of Table
23 with Table 15 for integers.

•  When communications run in PRI PAR in both transputers, the best
transfer rates are obtained for both bytes and integers. So the concurrent CPU
process does not affect the communications.
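
The first conclusion in the list above can be checked against the "1 PRIPAR" rows of Tables 22 and 23. The following is a minimal sketch in Python, with the values copied from those rows and a helper name chosen here for illustration:

```python
COLUMNS = ["1 out", "1 in/out", "2 out", "2 in/out",
           "3 out", "3 in/out", "4 out", "4 in/out"]

# "1 PRIPAR" rows, in kbits/sec.
BYTES_1_PRIPAR = [370, 230, 190, 150, 160, 110, 155, 90]      # Table 22
WORDS_1_PRIPAR = [1510, 970, 765, 640, 640, 470, 625, 375]    # Table 23


def word_to_byte_ratios(word_rates, byte_rates):
    """Ratio of word-transfer rate to byte-transfer rate, column by column."""
    return [round(w / b, 2) for w, b in zip(word_rates, byte_rates)]


for name, ratio in zip(COLUMNS, word_to_byte_ratios(WORDS_1_PRIPAR, BYTES_1_PRIPAR)):
    print(f"{name:>9}: {ratio}")
```

For this row every ratio falls between 4.0 and 4.3, consistent with a 4-byte word moving in roughly the time of one byte-sized message.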

It is always good to remember that the CPU load cases examined are
extreme cases that will rarely or never occur in an application program, but the
results obtained undoubtedly show a relation between CPU load and the performance
obtained on the links. So, referring back to research question 6, we are not yet able
to affirm whether the links can operate in parallel with the processor; the next
section addresses this point again.


Conclusion 6

Under a PAR construct, a process working concurrently on
the CPU will reduce the transfer rate on the links.


Under PRI PAR, it looks as if the communication process in high
priority does not suffer any slowdown, but we still have a doubt about how much work
a concurrent process can do when the communications are in PRI PAR and are
lengthy. This will be addressed in the next section.
B.       THE EFFECT OF THE COMMUNICATIONS OVER CONCURRENT
PROCESSES

This section addresses the eighth research question:

•      "What is the effect of the communications on the links over a process that is
being executed concurrently on the main processor of the same transputer?"

1.  Initial Considerations

To observe this we needed to time a fixed-length process with no
communications occurring in the processor in which it was executing, and then time
it again with communications in parallel through the links. As mentioned before, in
the latter case we needed to make sure that only the communications were happening
concurrently, hopefully in parallel, in order to guarantee that the process being timed
was not being slowed by processes other than communication processes.


PROC counter (CHAN in, out, VALUE tnumber) =
  -- description
  -- Sums up the first 100000 integers and adds the transputer
  -- number to the total
  DEF maxope = 100000 :      -- number of operations done
  VAR ch, total :
  VAR startime3, endtime3 :
  SEQ
    total := tnumber
    in ? ch
    TIME ? startime3
    SEQ i = [0 FOR maxope]
      total := total + i
    TIME ? endtime3
    out ! total; startime3; endtime3 :


Figure 4.7     Procedure Counter.

What was done, then, was to write a simple procedure called "counter", listed
in Figure 4.7, and place it in a transputer with no other process. For this purpose, a
transputer on a B003 board is more appropriate, because we are now timing
the processor itself, and on the B001 board performance could be affected by the
terminal_driver, user interface and so forth. It bears repeating that these effects were
not as strong for the link measurements because the links have a constant
speed of transmission, the 10 mbits/sec bit rate, independent of the processor's internal
cycle and load.

The procedure "counter" sums up the first 100,000 integers plus the number of the
transputer where it is located after receiving a flag, and sends the result, startime and
endtime out through a channel "out".

The harness in which we placed this procedure was a program called "Test
Linearity", briefly described here. This program is listed in Appendix F and
includes the main procedures Host Proc, Route, and Counter, which are separate
compilations (SC) and are placed in different transputers. The configuration used for
this program is shown in Figure 4.8. The procedure Host Proc provides the user, keyboard
and screen interfaces, and it is placed in the root transputer. The procedure Route is placed
in transputers 00, 10, 20 and 30, and executes the routing procedure and the
counter in parallel. The remaining transputers (12) execute only the counter procedure. If
we look closely at the topology of the processors in Figure 4.8, we can see that we have
a tree structure where the leaves execute only counters, the second level nodes are
the routers and the root is the host procedure (hostproc). Figure 4.9 lists Procedure
Route.

2.  Results Obtained

The first measurement was the time to execute "counter", for which we
obtained 130 msec, using the tick.to.time routine to convert the tick values. This value
was obtained in all 16 counters, whether alone in a leaf transputer or inside the routers,
meaning that the present level of communications was not affecting the concurrent
process on the routers' CPUs!

Then, to enforce continuous communication, we placed in each router,
besides the flags and in parallel with the counter, three block transfers to the three
leaves of each router, using the BYTE.SLICE procedure with blocks of 50,000 bytes
first, and then varying from 70,000 down to 256 bytes. Recalling Chapter 3, this would
assure us at least 105 and 147 msec of continuous communication for 50,000 and
70,000 bytes respectively, considering the rate of 3.8 mbits/sec. In fact, we also
measured in this new configuration the time to execute the communication process
alone; it took 103 and 144 msec respectively, implying a transfer rate of 3.88 mbits/sec
for the three channels transmitting in parallel. This result, a little higher than the ones
observed in the last chapter, is explained by two reasons:

1  The use of 15 MHz transputers with shorter processes inside, permitting
greater use of internal memory.

2  The bigger external memory, permitting the use of bigger block transfers.
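
The expected and measured times quoted above can be cross-checked with a short calculation. This is a minimal sketch in Python; the function names are illustrative only, and the inputs are the block sizes, the 3.8 mbits/sec rate and the measured times given in the text:

```python
def expected_ms(n_bytes, rate_mbits_per_sec):
    """Transfer time in msec for n_bytes at the given rate in mbits/sec."""
    return n_bytes * 8 / (rate_mbits_per_sec * 1000)


def implied_rate_mbits(n_bytes, measured_ms):
    """Transfer rate in mbits/sec implied by a measured time in msec."""
    return n_bytes * 8 / (measured_ms * 1000)


for size, measured in [(50_000, 103), (70_000, 144)]:
    print(size,
          round(expected_ms(size, 3.8)),              # expected msec at 3.8 mbits/sec
          round(implied_rate_mbits(size, measured), 2))  # rate implied by measurement
```

Both measured times give a rate slightly under 3.9 mbits/sec, in line with the 3.88 mbits/sec figure quoted.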

The  transfers  were  then  timed  in  two  modes: 



PROC route (CHAN messagein, messageout, routeto1, routeto2, routeto3,
            echofrom1, echofrom2, echofrom3, VALUE k) =
  DEF i = 4 :                     -- number of counter procedures
  VAR msg :                       -- flag
  VAR results[i] :
  VAR startime[i], endtime[i] :   -- timers
  CHAN softin, softout :          -- soft channels declared for
                                  -- communication with procedure counter
  -- SC PROC counter
  -- This procedure counter is listed in Figure 4.7
  SEQ
    PAR
      counter (softout, softin, k)
      -- routing process itself
      SEQ
        messagein ? msg
        SEQ
          PAR
            routeto1 ! msg
            routeto2 ! msg
            routeto3 ! msg
            softout ! msg
          PAR
            echofrom1 ? results[0]; startime[0]; endtime[0]
            echofrom2 ? results[1]; startime[1]; endtime[1]
            echofrom3 ? results[2]; startime[2]; endtime[2]
            softin ? results[3]; startime[3]; endtime[3]
    -- sending to the root results and timing
    SEQ i = [0 FOR 4]
      messageout ! results[i]; startime[i]; endtime[i] :


Figure 4.9     Procedure Route.

•  3 chanout, with three simultaneous BYTE.SLICE transmissions to the counters
in different transputers,

•  3 in/out, with six simultaneous transfers (3 input and 3 output) to/from the
counters in different transputers.

Note from the procedure route code in Figure 4.9 that a flag was sent to each
transputer to make sure it was ready for the BYTE.SLICE transfer, and then
another flag was sent to the local counter procedure, doing the best possible to make
the communications begin together with the local counter procedure.

As we can see from Table 26, for message blocks up to 520 bytes no effect
was noticed on the procedure counter! At this point no further investigation has
been made, and two speculations could account for the observed data:

1  Maybe beyond 520 bytes the arrays being transmitted begin to access the
external memory of the transputers. If this were the reason, the increase in time
should be more gradual than the abrupt increase of 35% in time
(46/130) for an increase of 4.6% in the number of bytes transmitted (24/520),
as shown in Table 25.
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TABLE  26 
TIM  MING  OF  PROCEDURE  COUNTER 

a.  Message  size  50,000  bytes 

-  time  to  execute  communications  only  :  103  msec 

-  time  to  execute  procedure  counter  with: 

1.  No  communications 130  msec 

2.  With  3  chanout 186  msec 

3.  With  3  in/out 195  msec 

b.  Variable  message  size  with  3  chanout 

Timing  of  procedure  counter( msec ) 

in  the  router  in  the  leaves 

bytes         par       oripar  both  cases 

70000          190          190  191 

50000          186          186  191 

10000          178          178  175 

1000          176          176  130 

544          176          176  130 

528          156          156  130 

520          130          130  130 

<  256          130          130  130 


2 The counter is being timed out (descheduled) when the communication takes more than 1 msec to finish. This looks more reasonable: a time slice of exactly 1 msec corresponds to 485 bytes transmitted at the 3.88 mbits/sec rate, but if the time slice is actually 1.07 msec, this would give us a transfer of 520 bytes in the period of one time slice, because:

•      (520 x 8) / 3,880,000 = 0.00107 sec or 1.07 msec
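The time-slice arithmetic above can be rechecked with a few lines of Python. This is only an illustrative sketch: the constant and function names are our own, and the 3.88 mbits/sec figure is simply the measured rate quoted in the text.

```python
# Sketch of the time-slice arithmetic; names are illustrative assumptions.
LINK_RATE_BPS = 3_880_000  # measured data rate quoted in the text, bits/sec


def transfer_time_msec(n_bytes, rate_bps=LINK_RATE_BPS):
    """Time needed to move n_bytes over one link, in milliseconds."""
    return n_bytes * 8 / rate_bps * 1000.0


def bytes_per_slice(slice_msec, rate_bps=LINK_RATE_BPS):
    """Whole bytes that fit into one time slice of slice_msec milliseconds."""
    return int(rate_bps * slice_msec / 1000.0 / 8)


print(round(transfer_time_msec(520), 2))  # time for the 520-byte threshold
print(bytes_per_slice(1.0))               # bytes moved in a 1 msec slice
```

Under these assumptions a 520-byte message needs about 1.07 msec, while a slice of exactly 1 msec would only cover 485 bytes, which is the gap the second speculation relies on.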

On the other hand, if we compare the total execution time of 195 msec for the worst case observed (processes being executed concurrently) with the sum of the individual times necessary for the counter and the communications to accomplish their tasks, 130 + 103 = 233 msec (see Table 25 a.), we see a difference of 38 msec, during which the two processes must be overlapping in time.

The great surprise, though, was the unpredictable effect on the transputer leaves, where the counter process is executed sequentially after the communications and the timing only begins after the communications are over (Table 25 b, last column). We have no reasonable explanation for that.

So, as we see, no definitive conclusion can be drawn about how the routing process and the counter process are being scheduled, but from the times obtained there has to be some overlap, although not total, between the counter and the routing processes in the router transputers. The results were consistent on the four routers.

Conclusion 7

The communication indeed affects the process being executed
in the CPU, for messages greater than a threshold size.
For our example this value was 520 bytes.
Below this message size, communications had no effect
on the process being executed on the CPU.


This first conclusion should lead us to a complete case study of the subject, varying the counter size and the message size and using another typical process instead of the simple counter, and observing the effects. It could be done in a way similar to what was done for the links, but time did not permit this to be included in this thesis, and it is another suggestion for follow-on research.

C. DOES THE TRANSPUTER ACHIEVE LINEAR PERFORMANCE IMPROVEMENTS?

We could see in Chapter 3 that the four links in one transputer in some cases give us linear performance improvements, because the transfer rate per channel is kept constant while we increase the number of channels in parallel. The reader may recall Tables 9, 10 and 11 for 512 bytes or larger.

If we now look into the process performance, turning back to the Test Linearity program, we can say that for this program each counter took 130 msec to execute and, timing from the host process on transputer root, we got a total execution time of 133 msec from the moment the first flag left channels hostout0 to 3 until the last result was received back.

A simple test was made mapping all the processes assigned to a B003 board with four transputers onto only one transputer. In this way, one route process plus 3 counters would run in parallel on only one CPU, that of the former router. See Figure 4.10, which shows the new procedure route5 that accomplishes this. The configuration now was the same one depicted in Figure 3.14, with a different process placement shown in the program structure in Figure 4.11. The results obtained are listed in Table 25.

As we see, to have a rigorously linear increase in performance we should have:

1 for each counter time:

• 517.5 (average) / 4 = 129.4 msec (rounded), and what we got was 130 msec each!

2 for the total execution time:

• 534 / 4 = 133.5 msec, and what we got was 133 msec!
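The same comparison can be expressed as a parallel efficiency, that is, the measured speedup divided by the factor by which the number of processors grew. The sketch below uses only the figures quoted in the text and in Table 25; the function name and the choice of the first transputer's four counter times are our own assumptions.

```python
# Sketch: speedup and efficiency when going from 4 to 16 transputers.
def efficiency(time_few_msec, time_many_msec, processor_ratio):
    """Speedup divided by the increase in processor count (1.0 = linear)."""
    speedup = time_few_msec / time_many_msec
    return speedup / processor_ratio


# Counter times on one transputer of the 4-transputer network (Table 25).
four_transputer_counters = [520, 518, 517, 515]
mean_counter = sum(four_transputer_counters) / len(four_transputer_counters)

counter_eff = efficiency(mean_counter, 130, 4)  # per-counter time
total_eff = efficiency(534, 133, 4)             # total execution time

print(mean_counter, round(counter_eff, 3), round(total_eff, 3))
```

Both efficiencies come out within about half a percent of 1.0. The value slightly above 1.0 for the total time is within the 1 msec resolution of the measurements and should not be read as super-linear speedup.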
PROC route5 (CHAN messagein, messageout, VALUE i) =

  PROC route (CHAN messagein, messageout, routeto1, routeto2, routeto3,
              echofrom1, echofrom2, echofrom3, VALUE k) =
    -- This procedure is the same as in Figure 4.9 and is not repeated.

  PROC counter (CHAN in, out, VALUE tnumber) =
    -- This procedure is the same as in Figure 4.7 and is not repeated.

  DEF totlinks = 32 :      -- constant for soft channel definitions
  CHAN pipe[totlinks] :    -- soft channel definitions

  PAR
    route (messagein, messageout, pipe[9+(6*i)], pipe[11+(6*i)],
           pipe[13+(6*i)], pipe[8+(6*i)], pipe[10+(6*i)], pipe[12+(6*i)], i)
    counter (pipe[9+(6*i)], pipe[8+(6*i)], ((10*i)+1))
    counter (pipe[11+(6*i)], pipe[10+(6*i)], ((10*i)+2))
    counter (pipe[13+(6*i)], pipe[12+(6*i)], ((10*i)+3)) :


Figure  4.10    Procedure  Route5. 
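The index expressions used for the soft channels in Figure 4.10 can be checked mechanically. The Python sketch below is not part of the original program. It only enumerates the pipe indexes that each instance i = 0..3 would use and confirms that they stay inside the 32-element array and never collide with each other or with indexes 0 to 7, which the configuration uses for the hard links. The function name is hypothetical.

```python
# Sketch: which pipe[] indexes does each route5 instance use?
TOTLINKS = 32                   # DEF totlinks = 32 in Figure 4.10
HARD_LINK_INDEXES = range(8)    # pipe[0..7] are placed on links in Figure 4.11


def soft_channel_indexes(i):
    """Indexes used by instance i: channels to and from the three counters."""
    to_counters = [9 + 6 * i, 11 + 6 * i, 13 + 6 * i]
    from_counters = [8 + 6 * i, 10 + 6 * i, 12 + 6 * i]
    return to_counters, from_counters


used = []
for i in range(4):
    to_counters, from_counters = soft_channel_indexes(i)
    used.extend(to_counters + from_counters)

print(sorted(used))
```

Each instance occupies six consecutive indexes (8 to 13 for i = 0, up to 26 to 31 for i = 3), so a single numbering scheme works for all four router positions without overlap.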

Another version of the Test Linearity Program was made and mapped onto only one T414 transputer on a B003 board. The execution time was then 2.3 seconds! A last version, made for the OPS system running on the VAX/VMS, ran at best in 8.8 seconds!


Conclusion  8 

With  normal  communication  load,  linear  increase 

of  performance  with  more  processors  may  be  achieved! 

The routing process does not drag the processor!

PROGRAM testlinearity

* Title   : Test Performance Linearity
* Version : 3
* Mod     : 0
* Author  : Jose Vanni Filho, Lcdr, Brazilian Navy
* Date    : June, 5th, 1987
* Programming Language : OCCAM 1
* Compiler : IMS D 600 - TDS
* Brief Description : This version of test linearity,
*    mapped into 5 transputers, shows us the increase
*    in time to execute the same processes of version 2
*    with the reduction of the number of processors, by
*    a factor of 4.

-- SC PROC hostproc (CHAN A,B,C,D,E,F,G,H)
-- This procedure is the same included in version 2 of the Test
-- Performance Linearity program in Appendix F and is not repeated.

-- SC PROC route5 (CHAN messagein, messageout, VALUE i)
-- This procedure is the same as in Figure 4.10 and is not repeated.

-- configuration
-- link definitions
DEF link0in  = 4 :
DEF link0out = 0 :
DEF link1in  = 5 :
DEF link1out = 1 :
DEF link2in  = 6 :
DEF link2out = 2 :
DEF link3in  = 7 :
DEF link3out = 3 :

DEF root = 100 :
DEF totlinks = 32 :
CHAN pipe[totlinks] :

PLACED PAR

  PROCESSOR root
    -- link placements and process assignment
    PLACE pipe[0] AT link0in :
    PLACE pipe[1] AT link0out :
    PLACE pipe[2] AT link1in :
    PLACE pipe[3] AT link1out :
    PLACE pipe[4] AT link2in :
    PLACE pipe[5] AT link2out :
    PLACE pipe[6] AT link3in :
    PLACE pipe[7] AT link3out :
    hostproc (pipe[0], pipe[2], pipe[4], pipe[6],
              pipe[1], pipe[3], pipe[5], pipe[7])

  PLACED PAR j = [0 FOR 4]
    PROCESSOR 10*j
      -- link placements and process assignment
      PLACE pipe[2*j] AT link0out :
      PLACE pipe[(2*j)+1] AT link0in :
      route5 (pipe[(2*j)+1], pipe[2*j], j)

Figure 4.11 Structure of Program Test Linearity (5).

TABLE 25

COMPARING COUNTER EXECUTION TIME IN 4 AND 16 TRANSPUTERS NETWORK

              16 transputers        4 transputers
   counter    time        NR        time        NR

     00       130 msec    00        520 msec    00
     01       130 msec    01        518 msec    00
     02       130 msec    02        517 msec    00
     03       130 msec    03        515 msec    00
     10       130 msec    10        520 msec    10
     11       130 msec    11        519 msec    10
     12       130 msec    12        517 msec    10
     13       130 msec    13        515 msec    10
     20       130 msec    20        520 msec    20
     21       130 msec    21        519 msec    20
     22       130 msec    22        517 msec    20
     23       130 msec    23        515 msec    20
     30       130 msec    30        520 msec    30
     31       130 msec    31        519 msec    30
     32       130 msec    32        517 msec    30
     33       130 msec    33        515 msec    30

   Total Execution
   (timed on B001)    133 msec      534 msec

V. CONCLUSION

When this research began, in October 1986, we had a new machine, working with a language that we did not know, and using a concept that even today is considered hard to grasp and to work with: Concurrency and Parallelism. After working for eight months with the transputer, the first conclusion that comes up is:

•      Concurrency and Parallelism are not difficult concepts to understand at all, using the Transputer and the Occam Programming Language.

In this first phase of the research, the evaluation of the Transputer hardware, several significant conclusions were reached. They are summarized in the following paragraphs, drawn from the body of the thesis, and give us a good first idea of the real potential and capabilities of the Transputer when programmed in Proto-Occam.

The bit rate on the links is switchable between 10 mbits/sec and 20 mbits/sec. When operating at the 10 mbits/sec rate, the data rate was at best 3.8 mbits/sec, or 475 kbytes/sec, per channel. So the eight link channels (four links, each used in both directions) will be able, in the best case, to exchange 3.8 mbytes of data in one second between adjacent transputers, because the links are really able to operate in parallel. We should remember that to obtain these results we need to use the BYTE SLICE or WORD SLICE constructs, with messages larger than 256 bytes. Likewise, when switched to the 20 mbits/sec rate, the maximum data rate obtained was 6.1 mbits/sec.
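The unit conversions behind these figures are easy to get wrong, so a short sketch may help. It assumes that a kbyte is 1000 bytes and that all eight channels (four links in both directions) sustain the best measured rate simultaneously. The names are illustrative only.

```python
# Sketch: per-channel and aggregate data rates, plus link utilisation.
BEST_RATE_10_BPS = 3_800_000   # best data rate at the 10 mbits/sec setting
BEST_RATE_20_BPS = 6_100_000   # best data rate at the 20 mbits/sec setting
CHANNELS = 8                   # 4 links x 2 directions (assumed all busy)


def bytes_per_sec(bits_per_sec):
    """Convert a bit rate into a byte rate."""
    return bits_per_sec // 8


per_channel = bytes_per_sec(BEST_RATE_10_BPS)   # bytes/sec on one channel
aggregate = CHANNELS * per_channel              # bytes/sec on all channels
utilisation_10 = BEST_RATE_10_BPS / 10_000_000  # fraction of nominal rate
utilisation_20 = BEST_RATE_20_BPS / 20_000_000

print(per_channel, aggregate, utilisation_10, utilisation_20)
```

Under these assumptions 3.8 mbits/sec corresponds to 475,000 bytes/sec per channel and 3.8 million bytes/sec in aggregate, and the useful data rate is roughly 38% of the nominal bit rate at 10 mbits/sec and about 30% at 20 mbits/sec.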

When a computation bound process is running in the CPU with the same priority as the routing process, it will reduce the transfer rate on the links for any construct, by at least 8% for one channel operating and by 21% for any other number of channels. These results were observed for message sizes of 10,000 bytes or smaller.

On the other hand, if we give high priority to the communications, the CPU process will be executed in the same way and the communications will keep the previously obtained rate of 3.8 mbits/sec, so this is strongly recommended.

Communications on the links will reduce the performance of a process being executed in the same CPU when message sizes exceed a threshold size, which depends on the process type. For our observed case this value was 520 bytes. For larger message sizes, the maximum reduction in performance for the computation bound process was 50% in the worst case (six channels operating in parallel).

The transputer is able to increase throughput linearly with the increase in the number of transputers on which the process is executed.

Although very promising, these conclusions are not complete, and here follow some suggestions for follow-on work in the evaluation:

1 To investigate the usage of the internal memory by the processor, especially whether priority is given to data or to program code when placing them in internal memory.

2 To investigate how the scheduler handles long communication processes that consume more than one time slice.

3 To use a Logic State Analyzer capable of sampling at a clock rate of 50 or 100 MHz, to measure more precisely the time delays involved in the receipt of a frame and the dispatch of the respective acknowledge.

4 To measure the time needed for an array of variable size to be transmitted through several transputers to a non-adjacent destination.

5 To use the Link Evaluation Program with greater message sizes. This would imply using B003 boards, which have 256 kbytes available per transputer instead of the 64 kbytes available on the B001 board, or replacing the B001 board by another board with larger external memory.

6 To make a thorough study of the effect of link operation on a computation bound process.

7 To benchmark a network of transputers configured as a hypercube against the commercially available hypercube computers, like the Intel iPSC-VX, using the Operating System presented by Cordeiro [Ref. 6].

Another suggestion for research is the development of real-time application programs, to observe the behavior of the machine under normal work load situations.

It is important to mention at this point that, as advertised, we could indeed use transputers with different internal clock cycles communicating with each other with no problems at all.

Equally important is to remember that in all results obtained in this research we were using bytes or integers, with no floating point operations. So one other recommended topic for investigation is the evaluation of link and processor performance for floating point data. This could be done in two ways:

1 By using the software floating point available in Occam 2, or

2 By using the hardware floating point that will be available with the T-800 transputer.

If we could state our impression of the transputer, the small size, the simplicity and the speed are the things that really stood out.

As a final suggestion, to enlarge the research horizons at the NPS, we recommend the replacement, when possible, of the B001 board interfacing with the VAX, which turned out to be a bottleneck for our 160 MIPS capable Transputer System, both in processing speed and in memory availability.

Occam is a very easy language to use, the fold editor is very powerful and friendly, and the channels are very good elements for synchronizing processes. But as soon as the Ada compiler becomes available, the research should follow that way, and then a comparison with the previously obtained performances will be helpful in judging the applicability of the Transputer in military real-time systems.

APPENDIX A
LEARNING  SEQUENCE 

a.  How  to  Log  in 

The first thing one will need is an account on the VAX/VMS to use OCCAM.

There is a group account with username "OCCAM", and through the C.S. Department staff one can get a sub-account to it.

Once a person gets a sub-account, one will have a password and a login name (normally the last name). With this, one should go to a VT 100 or VT 220 terminal (no other terminal will work!) and log in, and as soon as the "$" prompt appears, the VAX/VMS System is ready to begin.

If by any chance the person already has an account on the VAX/VMS system, he/she may want to work from his/her own account. That is possible, but as soon as the $ appears, and before one tries to use any of the OPS and/or TDS commands, one should type either:

•  opssetup - to use the OPS system, or

•  tdssetup - to use the TDS system.

These commands are already included in the login.com file of the OCCAM account and it is a good idea for one to include them in one's own login.com file too. Another thing one may need to do is to move to the "dua0:[OCCAM]" directory to copy files and libraries that have already been created and that will certainly be useful and save time for anybody.

b.  Learning  Sequence 

1.  Step  1 

The first thing one needs to know is how to use the VMS Operating System. One good choice is to run the online tutorial VMSCAI and/or get a VMS tutorial from the C.S. Department [Ref 22]. If the person is completely inexperienced it will take two sessions of two hours each to get a good feeling for it.

2.  Step 2

When one feels comfortable using the VAX/VMS, the next step is to get acquainted with the fold editor. This is a very powerful editor but most likely it will be new to everybody, and if one needs more information on it, he/she should refer to the Occam Programming System Manual [Ref 13: section 2].

To execute the tutorial:

copy from the OCCAM account the file "OPSTUTOR.DOC" using the following commands at the $ prompt:

•  "set default dua0:[OCCAM]" (this will move you to the OCCAM directory)

•  "opscopy opstutor.doc [.your_directory]"

•  "set default [.your_directory]" (to move back)

type: "ops opstutor.doc" at the $ prompt in your directory

This will open the opstutor.doc file, and the following will appear at the upper left of the screen:

"Press -ENTER FOLD- to start session"
"...F OPSTUTOR.DOC"

At this point one should press the key "0" and, while pressing it, also press key "1" (both keys are on the numerical keypad on the right side of the keyboard). This is the ENTER FOLD command. From here on just follow the on-screen instructions.

It is likely one will need about two hours the first time, but as one keeps using the editor he/she will find it very easy and powerful. It is a must to have a card with a photocopy of the keypad description codes. See Figure A.1.

3.  Step 3

Learning the Occam language is the next thing to do.

One may even begin reading the Occam Programming Manual [Ref 13: section 3] or Pountain's book [Ref 12] early in the learning process, if desired. If the reader knows any other structured language such as PASCAL, ADA, or C, it will be very easy. It is very important to get a good grasp of the channel concept!

4.  Step 4

At this point it would be good for one to know something about the transputer hardware and architecture. The Transputer Reference Manual is the reference, but the technical notes from INMOS or the existing theses will also help.

5.  Step 5

At this point one has a choice of learning one of the three systems available at the NPS: OPS, TDS for the VAX, or TDS for the PC. They are a little bit different, and a good choice for the beginner will be the OPS. This will enable the person to use the Occam language to create concurrent programs that will be compiled, linked, debugged and run on the VAX. The OPS Manual is the main reference for it.

Figure A.1     Keypad for Using the Fold Editor. (Reproduced by permission of INMOS Corporation.)

6.  Step 6

After that, depending on which system you will work on, you should learn the TDS for the VAX or for the PC. The respective reference manuals are the main references, but the theses by Cordeiro or Vanni present several hints and suggestions that may help. With respect to Occam, the only different skill one will need is how to make configurations. Again both theses will help.

APPENDIX B
OPS  TUTORIAL 

1.  Introduction

This appendix will describe briefly how to use the OPS system, resident on the VAX/VMS, to write a program, compile, link and execute it. It will not be a complete description of the system, and it assumes the reader already knows how to use the Fold Editor and the VMS Operating System on the VAX, and has already been exposed to the Occam Language. The main reference is the Occam Programming System Manual.

2.  The  Existing  File  Types: 

In  OPS  there  are  several  user  file  types  identified  by  the  file  extension: 

•  ".ops"  -  these  are  source  files,  folded,  that  may  be  edited,  and  once  in  the 
program  format,  may  be  compiled.  These  can  not  be  printed. 

•  ".lis"  -  these  are  listing  files  that  may  be  used  as  a  VMS  file  for  any  purpose. 
The  copy,  type,  print  commands  on  this  operating  system  work  with  no 
problem. 

•  ".obj"  -  these  are  object  files  that  were  already  compiled.  They  may  be  linked  to 
make  an  .exe  file.  They  are  not  printable. 

•  ".exe"  -  these  are  executable  files  that  were  compiled  and  linked  already.  They 
also  can  not  be  printed. 

3.  To  Start  the  System 

Once  one  is  logged  on  the  VAX/ VMS  on  a  terminal  VT-100  or  VT220,  the 
first  command  to  type  is: 

•  opssetup  -  this  will  enable  all  the  following  commands  used  in  the  OPS  to  be 
recognized  by  the  VMS  Operating  System,  through  the  "ops  kernel"  (opskrnl) 
resident  on  the  Systems  Directory. 

4.  To  Open  a  File 

Type: 

•  ops "filename" - this command may be applied to any ".ops" file and will make the file available to be edited with the fold editor. Every time one exits the outermost fold, a new version will be created on the VMS file system. Keep track.

5.  To Make a Procedure or a Program

The procedures and programs have a structure similar to that in Pascal or Ada. After the procedure name with its parameters follow the constant, variable and channel declarations and the procedures defined only inside that procedure (subprocedures), and at last the main body, which may begin with a WHILE, IF, SEQ, ALT, PAR, PRI PAR, a replicator, and so on, and terminates with a colon (:). It is best to give all procedures separate compilation (SC) capability, and for that we should apply the utility MAKE SC PROC to the procedure fold line.

The program has no parameters and no colon at the end, but the structure is the same as described for the procedure. It is important to say that in Occam one is not obliged to declare all constants, variables and channels at the beginning of the procedure. This may be done before any process. A process begins with any of the above mentioned constructs. The best way to learn is to look at finished programs, so we will stop this section here. When we use the utility MAKE PROGRAM, the word PROGRAM will automatically appear in front of your program name.

The global_definitions and library are very useful for making programs easily, and it is a good idea to put them in any program.

6.  To  Compile  a  Ready  Program  or  Procedure 

Any PROGRAM or SC PROC may be compiled separately, as long as the utilities "MAKE PROGRAM" or "MAKE SC PROC" respectively were applied to them and no error message occurred. To execute the compilation, the cursor has to be on a folded line with a PROGRAM or an SC PROC inside, and then the user should execute the utility COMPILE. The system will prompt for the object file name, and it is a good idea to use the same name as the source file.

7.  Debugging  a  Program  During  Compilation 

The  compiler  is  quick  and  every  time  one  gets  a  compilation  error,  the  error 
description  appears  on  top  of  the  screen  and  the  cursor  is  placed  on  the  line  where  the 
error  occurred,  or  one  before.  The  System  will  be  in  edit  mode  and  the  error  may  be 
corrected  at  once.  After  correcting  and  exiting  the  fold,  one  will  be  ready  to  compile 
again,  neatly  and  cleanly. 

8.  To  Link  a  Program 

After the object file has been created by the compilation, one has to leave the Fold Editor and at the VMS prompt ($) type:
•      link/debug opskrnl, program_name

The debug qualifier is optional and we did not use it much, but we can say it runs and permits one to trace a program's execution. After the linking, the .exe file will be created, and one is then able to run the program.

9.  To Run a Program

After the link has completed successfully, one should type after the $ prompt:
•      run/debug program_name

Again the debug qualifier is optional, and after this command the program will be running on the VAX. If logical errors occur, the two options are either to use the VAX on-line debugger or to go back to the source code (the .ops file) and place some output to the screen (Screen ! var).

Figure B.1     OPS Utilities.

10.    To  List  a  Program 

There are two ways to do that:

1      The first one is under VMS; one shall use the OPS command:
•      opslist filename.ops filename.lis

This will create a list file, filename.lis, to be printed on the VAX on-line printer. Be careful here! If one forgets to put "filename.lis" in the command, the source file will be transformed into a list file with the wrong extension. And worse, if by chance one purges the directory, all the fold structure the programmer created will be destroyed and will have to be redone if one needs to compile the program again.

2      The second one is under OPS; one shall use the utility LIST. This may be applied to any fold inside the program, and the user will be prompted for a file name (filename.lis).

It is important to mention that every time one lists a file, the folds will be opened and appear sequentially. It is not very easy for a beginner to follow a printout of the file. The fold editor permits a much better block view of what the program looks like. So, most likely, if one has a very hard bug to solve, debugging from the screen will be easier.

11.  Final Remarks

There are other commands and utilities that one may need to use after a while, but for the very beginning the ones listed here will suffice. Figure B.1 shows all the OPS Utilities and how to call them on a VT-100 terminal. FUNC means that one should press the 0 key on the numerical keypad together with the number on the keyboard (NOT the PF KEYS). Figure B.2 presents a simple program as an example.

-- PROGRAM hello1
-- hello1
-- This is the first program in OPS to be seen by a beginner.
-- This fold contains a simple occam program which says hello.
-- After the message appears on the screen you can type any
-- character. It will be echoed on the screen (no automatic
-- line feed or carriage return).
-- When you type "0" the program ends.

-- declarations
DEF hello = "hello! press 0 to stop running":
DEF EndBuffer = -3:      -- system's constant
CHAN Screen AT 1:        -- system's channel
CHAN Keyboard AT 2:      -- system's channel
VAR ch :
VAR going:               -- Boolean

SEQ
  -- main program
  SEQ i = [1 FOR hello[BYTE 0]]
    Screen ! hello[BYTE i]
  Screen ! EndBuffer     -- EndBuffer needed when outputting strings
  going := TRUE
  WHILE going
    SEQ
      Keyboard ? ch
      Screen ! ch; EndBuffer
      IF
        ch = #30         -- Hex value for ASCII 0
          going := FALSE
        TRUE
          SKIP

Figure  B.2    First  Program  in  OPS. 
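For readers coming from other languages, the string handling in Figure B.2 may look unusual. The loop bounds suggest that a Proto-Occam string is a byte table whose byte 0 holds the length and whose bytes 1 to n hold the characters. The Python sketch below mimics that layout. It is an analogy under that assumption, not a description of the OPS runtime, and the function names are hypothetical.

```python
# Sketch: a length-prefixed byte table, as the loop in Figure B.2 implies.
def occam_string(text):
    """Build a table whose byte 0 is the length, followed by the characters."""
    data = text.encode("ascii")
    if len(data) > 255:
        raise ValueError("length must fit in one byte")
    return bytes([len(data)]) + data


def emit(table):
    """Mirror of: SEQ i = [1 FOR hello[BYTE 0]]  Screen ! hello[BYTE i]"""
    out = []
    for i in range(1, table[0] + 1):
        out.append(chr(table[i]))
    return "".join(out)


hello = occam_string("hello! press 0 to stop running")
print(hello[0], emit(hello))
print(hex(ord("0")))  # the program stops on this character code
```

The replicator starts at 1 rather than 0 precisely because byte 0 is the length, and the termination test against #30 is simply the ASCII code of the character "0".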
APPENDIX C
TDS  TUTORIAL 

1.  Introduction 

This appendix will describe briefly how to use the TDS system, resident on the VAX/VMS at the NPS, to edit, compile, download and execute an Occam program. It will not be a complete description of the system, and it assumes the reader already knows how to use the OPS System, the Fold Editor and the VMS Operating System on the VAX. The main reference is the Transputer Development System Manual, D-600.

2.  The Existing File Types:

In  TDS  there  are  several  user  file  types  identified  by  the  file  extension: 

•  ".tds"  -  these  are  source  files,  folded,  that  may  be  edited,  and  once  in  the 
program  format,  may  be  compiled.  These  can  not  be  printed. 

•  ".lst" - these are listing files that may be used as a VMS file for any purpose. The copy, type and print commands of this operating system work with no problem. Originally the extension was ".lis", but we suggest that the programmer use a different extension in order to identify the file.

•  ".ted" - these are "transputer code" files originating from an extraction after a compilation was successfully completed. They are not printable.

•  ".cde" - these are non-executable files that were compiled and extracted already. They will exist when the programmer uses closed files inside his program, and contain the code for a file. They are not printable.

•  ".dsc"  -  these  are  descriptor  files  and  will  exist  only  when  the  programmer  used 
closed  files  in  his  programs.  They  are  not  printable. 

3.  To Start the System

Once one is logged on the VAX/VMS on a terminal VT-100 or VT220, the first command to type is:

•  tdssetup - this will enable all the following commands used in the TDS to be recognized by the VMS Operating System, through the "ops kernel" (opskrnl) resident on the Systems Directory.

4.    To  Open  a  File 

Type: 

•  tds "filename" - this command is applied to any ".tds" file and makes
the file available for editing with the fold editor. Every time you exit the
outermost fold, a new version is created on the VMS file system. Keep
track of the versions.

5. To Make a Procedure or a Program

Procedures and programs have a structure similar to that in OPS, so it
will not be repeated here.

The global_definitions and library make programs much easier to write,
and it is a good idea to put them in any program. There are two global_definitions, one
for each of the systems, identified by the extension. Be careful to embed in your
program the "global_def.tds".

A program for the TDS differs from one for the OPS in two ways:

1  To see any result on the screen, one must include inside the program
the terminal driver, provided by INMOS, for the board in use (B001,
B002 or B004).

2  A configuration is needed. The configuration basically gives
names to the physical channels and places on each transputer the process to be
executed there. Rather than explain it here, it is best to browse some of
the configurations in the theses by Vanni or Cordeiro, or in the
programs already existing in the Group account Occam.

6.  To  Compile  a  Ready  Program  or  Procedure 

Any PROGRAM or SC PROC may be compiled separately as long as the
utilities "MAKE PROGRAM" or "MAKE SC PROC", respectively, have been applied to it
and generated no error message. To compile, place the cursor on a
folded line with a PROGRAM or an SC PROC inside, and apply the utility
COMPILE. There will be no prompt at this time, except for the compilation
parameters. If the program has complicated nesting of PAR and ALT constructs, use
CHECK = FALSE.

7.  Debugging  a  Program  During  Compilation 

The compiler is quick. Every time one gets a compilation error, the error
description appears at the top of the screen and the cursor is placed, in edit mode,
on the line where the error occurred or on the one before it, so the error may be
corrected at once. Once it is corrected, exit the fold and one is ready to compile
again. Neat and clean.

8.  To  Extract  the  Code  to  Be  Executed  in  the  Transputer 

The compilation will create several folds inside the .ops program containing
the descriptor and the code to be executed. To extract the code, execute the utility
"EXTRACT TO FILE". At this point one will be prompted for a filename to extract to,
and we strongly suggest using the same name as the source file.

9. To Down Load and Run a Program

Once the ".tcd" file has been created, the user is ready to run the program
on the transputer network. Before down loading, check the wiring diagram (Utility 7)
and see that the links are properly connected. Then exit from the fold editor and
execute at the VMS ($) prompt:

TDS LOAD filename.tcd

The file will be opened by the VAX and the
programmer will be prompted for the escape sequence (normally ESC ESC ESC).
After the escape sequence is typed, the transputer becomes active and the code is loaded.
Check the Manual if any error message occurs. Once the program is down loaded, it
is executed at once, with no further intervention by the user. To stop
the transputer, press reset on the B001 board.

10.  To  List  a  Program 

There  are  two  ways  to  do  that: 

1  The first one is under VMS, using the OPS command:
•      opslist filename.ops filename.lst

This will create a list file named filename.lst to be printed on the VAX on-line
printer. Be careful here! If one forgets to put "filename.lst" on the command,
the source file will be transformed into a list file with the wrong extension.
Worse, if by chance one then purges the directory, all the fold structure the
programmer created will be destroyed and will have to be redone if the program
needs to be compiled again.

2  The second one is under TDS, using the utility LIST. This may be applied
to any fold inside the program, and the user will be prompted for a
"filename.lis". We suggest changing the extension to .lst to differentiate it
from the OPS list files.

11. Final Remarks

There are other commands and utilities that one may need after a while,
but for the very beginning the ones listed here will suffice. Figure C.1 shows the
utilities for the TDS System.

Figure C.1  The Utilities for the TDS System.

APPENDIX D
HINTS  ABOUT  OCCAM  PROGRAMMING 

The goal of this appendix is twofold: first, to mention some unexpected and
interesting things that happened to us and may happen to anyone programming in
Occam for the first time; and second, to make some comments about the Link
Evaluation Program.

a.  Program  Structure 

The program structure for OPS and TDS is quite similar, differing only in the
global definitions, the configuration, and some predefined procedures. The difference in
the global definitions is a critical one. While in OPS we place CHAN
Screen AT 1 and CHAN Keyboard AT 2, in TDS we just declare CHAN Screen: and
CHAN Keyboard:, because the Screen and Keyboard handling is done through the
terminal_driver.

The configuration section of a program is where we map the physical
channels and the processes onto the processors, and it exists only in the TDS system.

The predefined run time procedures are described in detail in the TDS manual
and the OPS manual, but they can only be used with the TDS. Some examples are
BYTE.SLICE.INPUT, PUT.BYTE, READ.BYTE and WORD.SLICE.OUTPUT.

1. A Program in OPS

Figure D.1 describes the structure of an OPS program.


PROGRAM progname
  global_def.ops   (collection of system defined constants)
  library.occ      (if wanted)
  any procedures used inside your program (optional)
  SC PROC 1        any separate compilation procedure that one may
  SC PROC 2        refer to and call from the main of the program
  PROC 3 (parameters ...)   procedures called by the main.
  local definitions for the main
  main
    SEQ   --- it could be PAR, ALT, WHILE TRUE, IF or a replicator
      code


Figure D.1  OPS Program Structure.

2. A Program in TDS

Figure D.2 describes the structure of a TDS program, showing as an
example the structure of the LINK EVALUATION PROGRAM.


PROGRAM link.evaluation
  each one of the following procedures has the same structure as
  depicted in Figure D.1
  SC PROC hostproc (parameters)         code for transputer root
  SC PROC transfer0_B003 (parameters)   code for transputer 0
  SC PROC transfer1_B003 (parameters)   code for transputer 1
  SC PROC transfer2_B003 (parameters)   code for transputer 2
  SC PROC transfer3_B003 (parameters)   code for transputer 3
  configuration
    ... link definitions
    ... physical channels declaration
    PLACED PAR
      PROCESSOR ROOT     ROOT = 100 (one may use any process number)
        ... channel placements (physical placement of the channels
                                according to the network topology)
        hostproc (physical channel parameters)
          the process hostproc is the outermost placed on
          transputer root and has to be an SC PROC
      PROCESSOR 0        As shown for transputer root, each of
      PROCESSOR 1        the processors is given a physical channel
      PROCESSOR 2        placement and a process placement.
      PROCESSOR 3


Figure D.2  TDS Program Structure Example.

Figures D.1 and D.2 give the reader an idea of the general
structure of an OPS and a TDS program. Normally, the terminal driver is one of the
SC PROCs inside the process placed on the transputer root, to permit user and screen
interaction. Cordeiro [Ref 6] describes in detail how to make a configuration and how
to map a program made for OPS into the TDS system, so this will not be
addressed here. Again, the best way to begin programming in Occam is to look at
sample programs already made.
b. Problems and Suggestions
1. Setting up Some Standards

Early in the learning process we felt it necessary to standardize some of our
programming practices. These may not be the best, but they are what we arrived at
after several changes during the research, and they are given as suggestions only:

•  Write all procedure and variable names in lower case. The system has some
predefined names like "EndBuffer", and all reserved words are upper case. Because
both OPS and TDS are case sensitive, doing this avoids naming conflicts. For
example, one may use a variable named "true" with no conflict with the system
defined "TRUE".

•  When a replicator has multiple statements under it, always end them with
a SKIP, as shown in Figure D.3. That will make certain that the last
index value is executed.

•  In programs with repetitive interactions with the user, use a new.line after each
execution and before the new prompt to the user (Keyboard ? var). This will
prevent unwanted multiple executions.

•  Every time a comment is placed in the code, use at least 3 dashes. This
makes it easy to tell, in the printout, what is a comment and what is the
beginning of a fold.


SEQ i = [0 FOR 5]
  SEQ
    in ? var1
    out ! var1 + 1
    SKIP       --- this is the SKIP we felt necessary


Figure  D.3     SKIP  Usage. 

2.  When  Making  Any  Procedure 

In order to permit any procedure to run in parallel with any other
process at all times, use as many channels as possible as parameters, instead of VAR
or VALUE parameters. Channels enable the programmer to exchange data between two
procedures without a procedure call. This is the key to parallelism. One good example
where this was used is the procedure cpubusysum, in Appendix E. Other examples can be
seen in the library routines defined inside the procedure getchoice, also in Appendix E.
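
The idea of passing channels instead of values can be sketched outside Occam. The following is a minimal illustration in Python, not the thesis code: the names cpu_busy_sum and run_demo are hypothetical, threads stand in for Occam processes, and queue.Queue stands in for a channel. Unlike an Occam channel, a Python queue is buffered, so the sender does not wait for the receiver; the sketch only shows that the worker is stopped and its result collected through channels, without a procedure call returning a value.

```python
import queue
import threading


def cpu_busy_sum(flag, counter_chan):
    """Keep adding until a stop flag arrives, then report the operation count."""
    counter = 0
    a, b = 1, 2
    while True:
        try:
            flag.get_nowait()      # poll the stop channel (plays the role of the ALT guard)
            break
        except queue.Empty:
            _ = a + b              # the "work" done between polls
            counter += 1
    counter_chan.put(counter)      # the result leaves through a channel


def run_demo():
    """Run the worker alongside the caller, stop it, and collect its count."""
    flag = queue.Queue()
    counter_chan = queue.Queue()
    worker = threading.Thread(target=cpu_busy_sum, args=(flag, counter_chan))
    worker.start()                 # the worker now runs in parallel with the caller
    flag.put(1)                    # tell the worker to stop
    count = counter_chan.get(timeout=5)
    worker.join(timeout=5)
    return count
```

Because the only links between the two processes are the two channels, the same worker could be placed next to any other process without changing its code.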

Also make the procedures SC PROCs as much as possible. This is better
for the programmer because a compilation error will be detected earlier
and the recompilation time will be shorter. It is also better for the compiler because it
stays away from the compilation limit.

3.  When  Compiling 

When compiling, several errors may be flagged. If the error message

•  "... shared variable varname" occurs, change the check compilation parameter
to false. When check is true, even the output of the same variable to several
different channels in parallel will make the compiler flag an error that does
not really exist.

If any error occurs, the compiler will always position the cursor before the
exact position of the error. Sometimes the error will be on the same line, and
sometimes on the next line of code.

4. When Making Large Programs

When making large programs, one should be aware of the compiler code
limit in both OPS and TDS. On the VAX this limit is around 100 blocks, or 50 kbytes
of code. To get around this problem, one should make some procedures inside the
program Separate Compilation (SC) procedures; the compiler will then be able
to handle it.
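
The block figures quoted in this appendix can be checked with a little arithmetic, given that a VMS disk block holds 512 bytes. The sketch below is in Python and is illustrative only; the function names are hypothetical, and the 100-block default is the approximate limit quoted above, not a value taken from the compiler documentation.

```python
VMS_BLOCK_BYTES = 512  # size of one VMS disk block


def blocks_to_kbytes(blocks):
    """Convert a size in VMS blocks to kbytes (1 kbyte = 1024 bytes)."""
    return blocks * VMS_BLOCK_BYTES / 1024


def exceeds_limit(blocks, limit_blocks=100):
    """True when a size in blocks is above the approximate limit (an assumption)."""
    return blocks > limit_blocks
```

With these definitions, 100 blocks correspond to 50 kbytes, and the 340 blocks quoted for the Link Evaluation Program correspond to 170 kbytes.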

5. When Down Loading the Code

When down loading the code, a message like the following will occur
from time to time:

•  "... Illegal board function" - we got this often, with no apparent cause. The
action taken when this happened was to down load again, sometimes up to 4
times, until the code was properly down loaded to the transputer network.

c.   Comments  About  the  Link  Evaluation  Program 

The  Link  Evaluation  Program  takes  about  340  blocks  of  the  VAX,  or 
approximately  170k  of  code  and  comments. 

Our approach to the Link Evaluation Program was top down, and we
think this was the right one. First the general structure was made, with all
procedures except the user interface and the terminal driver replaced by stubs. When this
was running, the byte.slice.transfer, the inout.transfer, the
word.slice.transfer and finally the int.transfer were added one by one. Even though
all these procedures were pre-tested using dedicated harnesses, new bugs sometimes
appeared when they were put together.

In general, the structure of the program is based on the four procedures just
mentioned, each of which resides in each of the transputers. When the program is
executed, the user's choice of construct causes the respective procedure to be executed
in parallel in all 5 transputers.

1. Most Common Errors

•  bad definition of buffer limits and lack of initialization.

•  mismatch of channel usage - a process outputting to a channel on which no
other process was waiting for an input.

•  compilation limit reached - this happened in procedure hostproc. In order
not to affect the performance measurements, the SC procedure getchoice was
created from part of the user.interface code, and procedure user.interface
was changed to call getchoice.

The first two problems are difficult to find because the symptom is the
program freezing during execution, with no message on the screen. To find
approximately where the error was occurring, we placed some "Screen ! var" statements
in the middle of the code; from then on, only reading the code and guessing what the
problem could be worked. We also tried, successfully, to trace the execution by
looking at the listed code and following the flow of communications.

As a final comment, the facility to reuse previously created software is
tremendous. Each configuration needs to be done only once, and can always be reused
by just changing the names of the placed procedures. Procedures and programs can
be attached to a new file or filed with one key stroke, using the file/unfile utility
of the fold editor.

APPENDIX E
THE  LINK  EVALUATION  PROGRAM 


header. occ 


Title    :  Link  Evaluation  Program 

Version   :   7 

Mod      :   0 

Author    :   Jose  Vanni  Filho,  Lcdr.,  Brazilian  Navy 

Date     :   June  /  02  /  1987 

Programming  Language  :  OCCAM  1 

Compiler  :   IMS  D  600  -  TDS 

Purpose   :   To  Evaluate  the  Transputer  link  transfer  rate 
for  several  channel  parallelism  situations, 
construct  types,  and  different  cpu  loads 


Brief description of program

Interactive program that uses the INMOS links at 10 Mbits/sec and
evaluates the transfer rates from the B001 board to the B003 board
using one to four channels in parallel for output and input.
The program calculates and displays the transfer rate after a
specified number of runs (20 for now) in a table format for
the following block.size and channel configurations:


--  Block Sizes

    1 -   2 -    4 -    8 -   16 -   32 -   64 -   128
  256 - 512 - 1024 - 1280 - 2048 - 4096 - 8192 - 10000

--  Channel configurations

out     -  1 channel (output) in one link
in/out  -  2 channels (input and output) in par in one link
out     -  2 channels (output) in parallel in two links
in/out  -  4 channels (input and output) in par in two links
out     -  3 channels (output) in parallel in three links
in/out  -  6 channels (input and output) in par in three links
out     -  4 channels (output) in parallel in four links
in      -  4 channels (input) in parallel in four links
in/out  -  8 channels (input and output) in par in four links


User options during program execution

User Options :
•   CPUs MODES OF OPERATION

0 - No concurrent process in the cpus
1 - B003 cpus with sum process concurrently (par)
2 - all cpus with sum process concurrently (par)
3 - B003 cpus with sum process concurrently (pripar)
4 - all cpus with sum process concurrently (pripar)
5 - B003 cpus with array product process concurrently (par)
6 - all cpus with array product process concurrently (par)
7 - B003 cpus with array product process concurrently (pripar)
8 - all cpus with array product process concurrently (pripar)

•   CONSTRUCTS AND DATA TYPES

A - input/output channels     ( CHARACTERS (BYTES) )
B - byte slice input/output   ( CHARACTERS (BYTES) )
I - input/output channels     ( INTEGERS (WORDS) )
W - word slice input/output   ( INTEGERS (WORDS) )

--   PROGRAM link.evaluation

--  link.evaluation PROCESSES

--  TRANSPUTER_ROOTB001.TDS

--  SC PROC hostproc

--  PROC hostproc (CHAN A,B,C,D,E,F,G,H)

PROC hostproc (CHAN A,B,C,D,E,F,G,H) =

--  description
--- This is the outer procedure placed on transputer Root. It contains
--- global variables and constants, and all procedures that run in this
--- transputer. It executes in parallel the procedures :
---     terminal.driver and user.interface

--  global_def.tds (partial)
--  Constants Definitions
DEF EndBuffer = -3

--- assign the i/o port of the B001 to terminal
--- set the baud.rate to 9600 bps
DEF port = 0
DEF baud = 11

--- constantly used ASCII values
DEF tab = 9
DEF lf = 10
DEF cr = 13
DEF esc = 27
DEF sp = 32

--  Channels Definitions
CHAN Screen :     --- defined for output to the Screen
CHAN Keyboard :   --- defined for input from the Keyboard

--  Link Definitions
DEF link0out = 0
DEF link1out = 1
DEF link2out = 2
DEF link3out = 3
DEF link0in = 4
DEF link1in = 5
DEF link2in = 6
DEF link3in = 7

--  library.occ (partial)
--  io_routines.occ

--  PROC new.line
PROC new.line =
  --- jumps to a new line on the screen
  SEQ
    Screen ! cr; lf; EndBuffer :

--  PROC write.string (VALUE string[])
PROC write.string (VALUE string[]) =
  --- Writes a given string to the screen, in a byte by byte fashion
  SEQ
    SEQ i = [1 FOR string[BYTE 0]]
      Screen ! string[BYTE i]
    Screen ! EndBuffer :

--  PROC clear.screen
PROC clear.screen =
  --- clears the screen
  SEQ
    Screen ! esc; ... 'J'; EndBuffer    --- clear screen sequence
    Screen ! esc; ...                   --- home cursor

--  PROC write.number (VALUE number)
--- This PROC outputs a signed integer value to the screen

PROC write.number (VALUE number) =
  VAR output[16], count, x :
  SEQ
    x := number
    count := 0
    IF
      --- handle special cases
      x = 0
        Screen ! '0'
      x < 0
        SEQ
          Screen ! '-'
          x := -x
      TRUE
        SKIP
    WHILE x > 0
      --- construct number
      SEQ
        output[count] := (x \ 10) + '0'
        count := count + 1
        x := x/10
    WHILE count > 0
      --- output number
      SEQ
        count := count - 1
        Screen ! output[count]
    SKIP :
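
--- The digit construction used by write.number can be restated outside Occam.
--- The following is a minimal sketch in Python, not part of the program. It
--- assumes that the operator between x and 10 in the digit assignment is the
--- remainder, and the name number_to_chars is hypothetical. Digits are produced
--- least significant first and then emitted in reverse, as the two WHILE loops do.

```python
def number_to_chars(number):
    """Return the characters write.number would send for a signed integer."""
    out = []
    x = number
    if x == 0:
        out.append("0")            # special case: zero has no digits to build
    elif x < 0:
        out.append("-")            # emit the sign, then work on the magnitude
        x = -x
    digits = []
    while x > 0:
        digits.append(chr((x % 10) + ord("0")))  # least significant digit first
        x //= 10
    while digits:
        out.append(digits.pop())   # emit in reverse: most significant first
    return "".join(out)
```

--- For example, number_to_chars(1280) gives the four characters 1, 2, 8, 0.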

--  utilities.occ

--  PROC transfer.rate (VALUE start,stop,board.type,nr.of.bytes ...)
PROC transfer.rate (VALUE start, stop, board.type, nr.of.bytes,
                    VAR rate) =
  --- receives two tick values "start" and "stop", number of bytes
  --- and board type and outputs the transfer rate.

  --  board number definitions
  --- board.type = 0   --->  VAX VMS
  --- board.type = 1   --->  B001
  --- board.type = 2   --->  B002
  --- board.type = 31  --->  B003 ( high priority )
  --- board.type = 32  --->  B003 ( low priority )
  --- board.type = 4   --->  B004
  --- outputs to the screen the transfer rate in kbits per second

  --  constant definitions
  DEF vax.sec   = 10000000 :   --- hundreds of nsec/second
  DEF b001.sec  = 625000 :     --- # of 1.6 microsec/second
  DEF b003h.sec = 1000000 :    --- # of microsec/second
  DEF b003l.sec = 15625 :      --- # of 64 microsec/second
  DEF max.number.of.ticks = 2147483648 :   --- maximum integer (2**31)

  --  variable declarations
  VAR elapsed.tick :
  VAR factor :         --- to convert ticks to seconds
  SEQ
    elapsed.tick := stop - start
    IF
      elapsed.tick < 0
        elapsed.tick := elapsed.tick + max.number.of.ticks
      TRUE
        SKIP
    --  selection of correct factor in accordance with the board
    IF
      board.type = 0        --- VAX VMS
        factor := vax.sec
      board.type = 1        --- B001
        factor := b001.sec
      board.type = 2        --- B002
        SKIP                --- to be implemented in the future
      board.type = 31       --- B003 in high priority
        factor := b003h.sec
      board.type = 32       --- B003 in low priority
        factor := b003l.sec
      board.type = 4        --- B004
        SKIP                --- to be implemented in the future
    --  rate calculation
    IF
      board.type = 32
        rate := ((nr.of.bytes*8)*factor)/(elapsed.tick*1000)
        --- operation is done this way to keep precision ok!
      TRUE
        rate := ((nr.of.bytes*8)*(factor/1000))/elapsed.tick
        --- operation is done in this way in order not to exceed
        --- maxint on the numerator.
        --- multiply by 8 due to 8 bits per byte
        --- divide by 1000 to have the transfer rate in kbits/sec
    SKIP :
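
--- The arithmetic of transfer.rate can be checked with a short sketch. The
--- following is in Python, not Occam, and is an illustration under stated
--- assumptions: the names transfer_rate_kbits and TICKS_PER_SECOND are
--- hypothetical, and the tick factors are the ones defined in the procedure.
--- Python integers do not overflow, so the ordering of the operations matters
--- here only for truncation: 15625/1000 would truncate to 15, which is why the
--- low priority case multiplies before dividing.

```python
# Clock ticks per second for each board type, as defined in transfer.rate.
TICKS_PER_SECOND = {
    0: 10_000_000,   # VAX VMS, ticks of 100 ns
    1: 625_000,      # B001, ticks of 1.6 microseconds
    31: 1_000_000,   # B003 high priority, ticks of 1 microsecond
    32: 15_625,      # B003 low priority, ticks of 64 microseconds
}
MAX_TICKS = 2**31    # value added when the tick difference is negative


def transfer_rate_kbits(start, stop, board_type, nr_of_bytes):
    """Return the transfer rate in kbits/sec using integer arithmetic."""
    elapsed = stop - start
    if elapsed < 0:                  # the clock wrapped between the two readings
        elapsed += MAX_TICKS
    factor = TICKS_PER_SECOND[board_type]
    bits = nr_of_bytes * 8           # 8 bits per byte
    if board_type == 32:
        # multiply first: factor // 1000 would lose precision (15625 -> 15)
        return (bits * factor) // (elapsed * 1000)
    # divide the factor first to keep the numerator small
    return (bits * (factor // 1000)) // elapsed
```

--- As a check, 125000 bytes moved in one second on the high priority clock give
--- 1000 kbits/sec, and 1000 bytes in one second on the low priority clock give 8.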

--  PROC capitalize (VAR ch)
PROC capitalize (VAR ch) =
  --- capitalizes any lower case character into upper case
  DEF delta = ('a' - 'A')
  --- A ---> 65
  --- a ---> 97      ASCII values
  --- z ---> 122
  SEQ
    IF
      (ch <= 'z') AND (ch >= 'a')
        ch := ch - delta
      TRUE
        SKIP :

--  SC PROC IMS.B001.terminal.driver()

--  TERMINAL_DRIVER.TDS

--  PROC IMS.B001.terminal.driver (CHAN Keyboard, Screen,
--                                 VALUE port, baud.rate)

--- The terminal driver used is the one provided by the
--- manufacturer for the board B001, and for that reason
--- is not included here.

--  SC PROC cpubusysum (CHAN flag1, counterchan)    sum

--  CPUBUSYSUM.TDS

--  PROC cpubusysum (CHAN flag1, counterchan)
PROC cpubusysum (CHAN flag1, counterchan) =
  --  description
  --- It keeps the cpu working in parallel (time sharing) with the link
  --- transfers by doing sum operations. It stops when it receives
  --- a flag by the channel flag1 from the transfer procedure that is
  --- being executed concurrently. It outputs by channel counterchan
  --- the number of operations done.

  VAR a, b, e,
      working,
      counter,
      ch :
  SEQ
    counter := 0
    working := TRUE
    TIME ? a
    WHILE working
      ALT
        flag1 ? ch
          working := FALSE
        TIME ? b
          SEQ
            e := a + b
            counter := counter + 1
    counterchan ! counter :

--  CPUBUSYSUM.dsc   descriptor

--  CPUBUSYSUM.cde   code

--  SC PROC cpubusyprod (CHAN flag1, counterchan)   product

--  CPUBUSYPROD.TDS

--  PROC cpubusyprod (CHAN flag1, counterchan)
PROC cpubusyprod (CHAN flag1, counterchan) =
  --  description
  --- It keeps the cpu working in parallel (time sharing) with the link
  --- transfers by doing array multiplications. It stops when it receives
  --- a flag by the channel flag1 from the transfer procedure that is
  --- being executed concurrently. It outputs by channel counterchan
  --- the number of operations done.

  --  constants and variable declarations
  DEF number = 100:        --- size of array
  VAR a[number + 1],       --- array of integers
      b[number + 1],       --- array of integers
      e[number + 1],       --- array of integers
      clock,               --- integer - variable to get time
      working,             --- boolean - to stop execution
      counter,             --- integer - number of operations done
      ch :
  SEQ
    --  initialize buffers and variables
    SEQ i = [1 FOR number]
      SEQ
        a[i] := 3*i
        b[i] := 5*i
        SKIP
    counter := 0
    working := TRUE
    WHILE working
      ALT
        flag1 ? ch
          working := FALSE
        TIME ? clock
          SEQ
            SEQ i = [1 FOR number]
              e[i] := a[i] * b[i]
            counter := counter + number    --- updates nr. of operations
    counterchan ! counter :

--  global constant and variable declarations for transputer root
DEF sizetable = TABLE [ 1, 2, 4, 8, 16, 32, 64, 128, 256, 512,
                        1024, 1280, 2048, 4096, 8192, 10000 ]:
DEF nr.of.sizes = 16:          --- as counted from above table
DEF maxblock.size = 10000:     --- last from the above table
DEF repetition = 20:           --- for averaging purposes
DEF maxwordblock.size = maxblock.size/4 :

CHAN hostin0 AT link0in :
CHAN hostin1 AT link1in :
CHAN hostin2 AT link2in :
CHAN hostin3 AT link3in :
CHAN hostout0 AT link0out :
CHAN hostout1 AT link1out :
CHAN hostout2 AT link2out :
CHAN hostout3 AT link3out :

--  PROC inout.transfer (VALUE repetition, cpumode)
PROC inout.transfer (VALUE repetition, cpumode) =
  --  description
  --- It initializes the buffers and it executes the procedure
  --- iotransfer, and, when applicable, one of the following:
  --- cpubusy.prod or cpubusy.sum (according to cpumode).
  --- Uses global constant maxblock.size.

  --  variable declarations
  CHAN flag,        --- flags the cpu to stop
       counter :    --- return the number of operations cpu did
  VAR buffer0 [BYTE maxblock.size + 1],
      buffer1 [BYTE maxblock.size + 1],
      buffer2 [BYTE maxblock.size + 1],
      buffer3 [BYTE maxblock.size + 1] :

--  PROC iotransfer (VALUE repetition, cpumode, CHAN flag, ...)
PROC iotransfer (VALUE repetition, cpumode, CHAN flag, counter) =
  --  Description
  --- Executes sequentially several parallel transfers of bytes
  --- to/from one to four transputers using the input/output
  --- primitive and output to the screen the transfer rate
  --- values of the output TABLE.
  --- Uses global constants : sizetable, nr.of.sizes, repetition

  --  variable declarations
  VAR block.size,
      actual.rate,
      rate,
      number,                   --- the number of operations cpu did
      ch[4],
      deadtime,
      time0[4],
      time1[4],
      deadtime0, deadtime1 :    --- to calculate deadtime

  SEQ
    SEQ i = [0 FOR nr.of.sizes]
      SEQ
        --  making the table
        block.size := sizetable[i]
        write.number (block.size)
        Screen ! tab
        --  calculation of deadtime
        TIME ? deadtime0
        SEQ i = [1 FOR block.size]
          SKIP
        TIME ? deadtime1
        deadtime := deadtime1 - deadtime0

        --  output to one channel
        actual.rate := 0
        SEQ j = [1 FOR repetition]
          SEQ
            hostin0 ? ch[0]
            TIME ? time0[0]
            SEQ k = [1 FOR block.size]
              hostout0 ! buffer0 [BYTE k]
            TIME ? time1[0]
            time1[0] := time1[0] - deadtime
            transfer.rate (time0[0], time1[0], 1, block.size, rate)
            actual.rate := ((actual.rate * (j-1)) + rate)/j
            SKIP
        write.number (actual.rate)
        Screen ! tab

        --  output/input from one channel
        actual.rate := 0
        SEQ j = [1 FOR repetition]
          SEQ
            hostin0 ? ch[0]
            TIME ? time0[0]
            SEQ k = [1 FOR block.size]
              PAR
                hostout0 ! buffer0 [BYTE k]
                hostin0 ? buffer1 [BYTE k]
            TIME ? time1[0]
            time1[0] := time1[0] - deadtime
            transfer.rate (time0[0], time1[0], 1, block.size, rate)
            actual.rate := ((actual.rate * (j-1)) + rate)/j
            SKIP
        write.number (actual.rate)
        Screen ! tab

        --  output to two channels
        actual.rate := 0
        SEQ j = [1 FOR repetition]
          SEQ
            PAR
              hostin0 ? ch[0]
              hostin1 ? ch[1]
            TIME ? time0[0]
            SEQ k = [1 FOR block.size]
              PAR
                hostout0 ! buffer0 [BYTE k]
                hostout1 ! buffer1 [BYTE k]
            TIME ? time1[0]
            time1[0] := time1[0] - deadtime
            transfer.rate (time0[0], time1[0], 1, block.size, rate)
            actual.rate := ((actual.rate * (j-1)) + rate)/j
            SKIP
        write.number (actual.rate)
        Screen ! tab

        --  output/input from two channels
        actual.rate := 0
        SEQ j = [1 FOR repetition]
          SEQ
            PAR
              hostin0 ? ch[0]
              hostin1 ? ch[1]
            TIME ? time0[0]
            SEQ k = [1 FOR block.size]
              PAR
                hostout0 ! buffer0 [BYTE k]
                hostout1 ! buffer1 [BYTE k]
                hostin0 ? buffer2 [BYTE k]
                hostin1 ? buffer3 [BYTE k]
            TIME ? time1[0]
            time1[0] := time1[0] - deadtime
            transfer.rate (time0[0], time1[0], 1, block.size, rate)
            actual.rate := ((actual.rate * (j-1)) + rate)/j
            SKIP
        write.number (actual.rate)
        Screen ! tab

        --  output to three channels
        actual.rate := 0
        SEQ j = [1 FOR repetition]
          SEQ
            PAR
              hostin0 ? ch[0]
              hostin1 ? ch[1]
              hostin2 ? ch[2]
            TIME ? time0[0]
            SEQ k = [1 FOR block.size]
              PAR
                hostout0 ! buffer0 [BYTE k]
                hostout1 ! buffer1 [BYTE k]
                hostout2 ! buffer2 [BYTE k]
            TIME ? time1[0]
            time1[0] := time1[0] - deadtime
            transfer.rate (time0[0], time1[0], 1, block.size, rate)
            actual.rate := ((actual.rate * (j-1)) + rate)/j
            SKIP
        write.number (actual.rate)
        Screen ! tab

-- output/input from three channels
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
      hostin2 ? ch[2]
    TIME ? time0[0]
    SEQ k = [1 FOR block.size]
      PAR
        hostout0 ! buffer0[BYTE k]
        hostout1 ! buffer1[BYTE k]
        hostout2 ! buffer2[BYTE k]
        hostin0 ? buffer0[BYTE k]
        hostin1 ? buffer1[BYTE k]
        hostin2 ? buffer2[BYTE k]
    TIME ? time1[0]
    time1[0] := time1[0] - deadtime
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- output to four channels
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
      hostin2 ? ch[2]
      hostin3 ? ch[3]
    TIME ? time0[0]
    SEQ k = [1 FOR block.size]
      PAR
        hostout0 ! buffer0[BYTE k]
        hostout1 ! buffer1[BYTE k]
        hostout2 ! buffer2[BYTE k]
        hostout3 ! buffer3[BYTE k]
    TIME ? time1[0]
    time1[0] := time1[0] - deadtime
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab


-- input from four channels
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
      hostin2 ? ch[2]
      hostin3 ? ch[3]
    TIME ? time0[0]
    SEQ k = [1 FOR block.size]
      PAR
        hostin0 ? buffer0[BYTE k]
        hostin1 ? buffer1[BYTE k]
        hostin2 ? buffer2[BYTE k]
        hostin3 ? buffer3[BYTE k]
    TIME ? time1[0]
    time1[0] := time1[0] - deadtime
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab


-- all output and input in parallel
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
      hostin2 ? ch[2]
      hostin3 ? ch[3]
    TIME ? time0[0]
    PAR
      SEQ k = [1 FOR block.size]
        PAR
          hostout0 ! buffer0[BYTE k]
          hostout1 ! buffer1[BYTE k]
          hostout2 ! buffer2[BYTE k]
          hostout3 ! buffer3[BYTE k]
      SEQ k = [1 FOR block.size]
        PAR
          hostin0 ? buffer0[BYTE k]
          hostin1 ? buffer1[BYTE k]
          hostin2 ? buffer2[BYTE k]
          hostin3 ? buffer3[BYTE k]
    TIME ? time1[0]
    time1[0] := time1[0] - deadtime
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
new.line
SKIP
new.line

-- send to screen operations done concurrently
IF
  cpumode = '0'
    write.string (" No other process running concurrently ")
  (((cpumode='2') OR (cpumode='4')) OR
   ((cpumode='6') OR (cpumode='8')))
    SEQ
      flag ! 'a'
      counter ? number
      write.string ("Number of operations (in //) at ")
      write.string ("the b001 transputer ")
      write.number (number)
      new.line
      hostin0 ? number
      write.string ("Number of operations (in //) at ")
      write.string ("transputer 0 (b003) ")
      write.number (number)
  TRUE
    SEQ
      hostin0 ? number
      write.string ("Number of operations (in //) ")
      write.string ("transputer 0 (b003)")
      write.number (number)
new.line
new.line :


SEQ  -- main inout.transfer
  -- initializing buffers
  SEQ k = [1 FOR maxblock.size]
    SEQ
      buffer0[BYTE k] := '0'
      buffer1[BYTE k] := '1'
      buffer2[BYTE k] := '2'
      buffer3[BYTE k] := '3'
  SKIP
  IF
    cpumode = '2'
      PAR
        iotransfer (repetition, cpumode, flag, counter)
        cpubusysum (flag, counter)
    cpumode = '4'
      PRI PAR
        iotransfer (repetition, cpumode, flag, counter)
        cpubusysum (flag, counter)
    cpumode = '6'
      PAR
        iotransfer (repetition, cpumode, flag, counter)
        cpubusyprod (flag, counter)
    cpumode = '8'
      PRI PAR
        iotransfer (repetition, cpumode, flag, counter)
        cpubusyprod (flag, counter)
    TRUE
      iotransfer (repetition, cpumode, flag, counter):
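
Every measurement block in these listings folds the rate of repetition j into actual.rate with the same update, ((actual.rate * (j-1)) + rate)/j. The sketch below restates that update in Python, purely as an illustration; it is not part of the thesis code. The name running_mean is hypothetical, and floating-point arithmetic is assumed here, whereas the occam variables are integers.

```python
def running_mean(rates):
    """Fold per-repetition rates into their mean, one value at a time.

    Mirrors the update used for actual.rate in the listing:
        actual.rate := ((actual.rate * (j-1)) + rate) / j
    """
    actual_rate = 0.0
    # j counts repetitions from 1, as in SEQ j = [1 FOR repetition].
    for j, rate in enumerate(rates, start=1):
        actual_rate = (actual_rate * (j - 1) + rate) / j
    return actual_rate
```

After the last repetition the accumulated value equals the arithmetic mean of all measured rates, so no array of samples has to be kept.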
-- PROC byte.slice.transfer (VALUE repetition, cpumode)
PROC byte.slice.transfer (VALUE repetition, cpumode) =
  -- description
  -- It initializes the buffers and it executes the procedure
  -- transfer, and, when applicable, one of the following:
  -- cpubusy.prod or cpubusy.sum (according to cpumode).
  -- Uses global constant maxblock.size.

  -- variable declarations
  CHAN flag,      -- flags the cpu to stop
       counter:   -- returns the number of operations cpu did
  VAR buffer0[BYTE maxblock.size + 1],
      buffer1[BYTE maxblock.size + 1],
      buffer2[BYTE maxblock.size + 1],
      buffer3[BYTE maxblock.size + 1]:

-- PROC transfer (VALUE repetition, cpumode, CHAN flag, counter)
PROC transfer (VALUE repetition, cpumode, CHAN flag, counter) =
  -- Description
  -- Executes sequentially several parallel transfers of bytes
  -- to/from one to four transputers using the BYTE.SLICE procedures
  -- and outputs to the screen the transfer rate values of the
  -- output TABLE.
  -- Uses global constants: sizetable, nr.of.sizes, repetition

  -- variable declarations
  VAR block.size,
      actual.rate,
      rate,
      number,       -- the number of operations cpu did
      ch[4],
      time0[4],
      time1[4]:
  SEQ
    SEQ i = [0 FOR nr.of.sizes]
      SEQ
        -- making the table after each io operation
        block.size := sizetable[i]
        write.number (block.size)
        Screen ! tab

-- output to one channel
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    hostin0 ? ch[0]
    TIME ? time0[0]
    BYTE.SLICE.OUTPUT (hostout0, buffer0, 1, block.size)
    TIME ? time1[0]
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- output/input to one channel
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    hostin0 ? ch[0]
    TIME ? time0[0]
    PAR
      BYTE.SLICE.OUTPUT (hostout0, buffer0, 1, block.size)
      BYTE.SLICE.INPUT (hostin0, buffer1, 1, block.size)
    TIME ? time1[0]
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- output to two channels
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
    TIME ? time0[0]
    PAR
      BYTE.SLICE.OUTPUT (hostout0, buffer0, 1, block.size)
      BYTE.SLICE.OUTPUT (hostout1, buffer1, 1, block.size)
    TIME ? time1[0]
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- output/input from two channels
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
    TIME ? time0[0]
    PAR
      BYTE.SLICE.OUTPUT (hostout0, buffer0, 1, block.size)
      BYTE.SLICE.OUTPUT (hostout1, buffer1, 1, block.size)
      BYTE.SLICE.INPUT (hostin0, buffer2, 1, block.size)
      BYTE.SLICE.INPUT (hostin1, buffer3, 1, block.size)
    TIME ? time1[0]
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab


-- output to three channels
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
      hostin2 ? ch[2]
    TIME ? time0[0]
    PAR
      BYTE.SLICE.OUTPUT (hostout0, buffer0, 1, block.size)
      BYTE.SLICE.OUTPUT (hostout1, buffer1, 1, block.size)
      BYTE.SLICE.OUTPUT (hostout2, buffer2, 1, block.size)
    TIME ? time1[0]
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- output/input from three channels
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
      hostin2 ? ch[2]
    TIME ? time0[0]
    PAR
      BYTE.SLICE.OUTPUT (hostout0, buffer0, 1, block.size)
      BYTE.SLICE.OUTPUT (hostout1, buffer1, 1, block.size)
      BYTE.SLICE.OUTPUT (hostout2, buffer2, 1, block.size)
      BYTE.SLICE.INPUT (hostin0, buffer0, 1, block.size)
      BYTE.SLICE.INPUT (hostin1, buffer1, 1, block.size)
      BYTE.SLICE.INPUT (hostin2, buffer2, 1, block.size)
    TIME ? time1[0]
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab


-- output to four channels
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
      hostin2 ? ch[2]
      hostin3 ? ch[3]
    TIME ? time0[0]
    PAR
      BYTE.SLICE.OUTPUT (hostout0, buffer0, 1, block.size)
      BYTE.SLICE.OUTPUT (hostout1, buffer1, 1, block.size)
      BYTE.SLICE.OUTPUT (hostout2, buffer2, 1, block.size)
      BYTE.SLICE.OUTPUT (hostout3, buffer3, 1, block.size)
    TIME ? time1[0]
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- input from four channels
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
      hostin2 ? ch[2]
      hostin3 ? ch[3]
    TIME ? time0[0]
    PAR
      BYTE.SLICE.INPUT (hostin0, buffer0, 1, block.size)
      BYTE.SLICE.INPUT (hostin1, buffer1, 1, block.size)
      BYTE.SLICE.INPUT (hostin2, buffer2, 1, block.size)
      BYTE.SLICE.INPUT (hostin3, buffer3, 1, block.size)
    TIME ? time1[0]
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- all output and input in parallel
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
      hostin2 ? ch[2]
      hostin3 ? ch[3]
    TIME ? time0[0]
    PAR
      BYTE.SLICE.OUTPUT (hostout0, buffer0, 1, block.size)
      BYTE.SLICE.OUTPUT (hostout1, buffer1, 1, block.size)
      BYTE.SLICE.OUTPUT (hostout2, buffer2, 1, block.size)
      BYTE.SLICE.OUTPUT (hostout3, buffer3, 1, block.size)
      BYTE.SLICE.INPUT (hostin0, buffer0, 1, block.size)
      BYTE.SLICE.INPUT (hostin1, buffer1, 1, block.size)
      BYTE.SLICE.INPUT (hostin2, buffer2, 1, block.size)
      BYTE.SLICE.INPUT (hostin3, buffer3, 1, block.size)
    TIME ? time1[0]
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
new.line
SKIP
new.line

-- send to screen operations done concurrently
IF
  cpumode = '0'
    write.string (" No other process running concurrently ")
  (((cpumode='2') OR (cpumode='4')) OR
   ((cpumode='6') OR (cpumode='8')))
    SEQ
      flag ! 'a'
      counter ? number
      write.string ("Number of operations (in //) at the ")
      write.string ("b001 transputer ")
      write.number (number)
      new.line
      hostin0 ? number
      write.string ("Number of operations (in //) at ")
      write.string ("transputer 0 (b003) ")
      write.number (number)
  TRUE
    SEQ
      hostin0 ? number
      write.string ("Number of operations (in //) at ")
      write.string ("transputer 0 (b003) ")
      write.number (number)
new.line
new.line :


SEQ  -- main byte.slice.transfer
  -- initializing buffers
  SEQ k = [1 FOR maxblock.size]
    SEQ
      buffer0[BYTE k] := '0'
      buffer1[BYTE k] := '1'
      buffer2[BYTE k] := '2'
      buffer3[BYTE k] := '3'
  SKIP
  IF
    cpumode = '2'
      PAR
        transfer (repetition, cpumode, flag, counter)
        cpubusysum (flag, counter)
    cpumode = '4'
      PRI PAR
        transfer (repetition, cpumode, flag, counter)
        cpubusysum (flag, counter)
    cpumode = '6'
      PAR
        transfer (repetition, cpumode, flag, counter)
        cpubusyprod (flag, counter)
    cpumode = '8'
      PRI PAR
        transfer (repetition, cpumode, flag, counter)
        cpubusyprod (flag, counter)
    TRUE
      transfer (repetition, cpumode, flag, counter):
-- PROC int.transfer (VALUE repetition, cpumode)
PROC int.transfer (VALUE repetition, cpumode) =
  -- description
  -- It initializes the buffers and it executes the procedure
  -- intransfer, and, when applicable, one of the following:
  -- cpubusy.prod or cpubusy.sum (according to cpumode).
  -- Uses global constant maxblock.size.

  -- variable declarations
  CHAN flag,      -- flags the cpu to stop
       counter:   -- returns the number of operations cpu did
  VAR wbuffer0[maxwordblock.size + 1],
      wbuffer1[maxwordblock.size + 1],
      wbuffer2[maxwordblock.size + 1],
      wbuffer3[maxwordblock.size + 1]:


-- PROC intransfer (VALUE repetition, cpumode, CHAN flag, counter)
PROC intransfer (VALUE repetition, cpumode, CHAN flag, counter) =
  -- description
  -- Executes sequentially several parallel transfers of integers
  -- to/from one to four transputers using input/output primitives
  -- and outputs to the screen the transfer rate values of the
  -- output TABLE.
  -- Uses global constants: sizetable, nr.of.sizes, repetition

  -- variable declarations
  VAR block.size,
      actual.rate,
      rate,
      number,       -- the number of operations cpu did
      ch[4],
      deadtime, deadtime0, deadtime1,   -- to calculate deadtime
      time0[4],
      time1[4]:
  SEQ
    SEQ i = [0 FOR nr.of.sizes]
      SEQ
        -- making the table
        block.size := sizetable[i]
        write.number (block.size)
        Screen ! tab
        IF
          block.size < 4
            SEQ
              write.string ("minimum transfer for integers ")
              write.string ("is 4 bytes(word)")
          TRUE
            SEQ
              -- calculation of deadtime
              TIME ? deadtime0
              SEQ i = [1 FOR (block.size/4)]
                SKIP
              TIME ? deadtime1
              deadtime := deadtime1 - deadtime0

-- io handling
-- output to one channel
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    hostin0 ? ch[0]
    TIME ? time0[0]
    SEQ k = [1 FOR (block.size/4)]
      hostout0 ! wbuffer0[k]
    TIME ? time1[0]
    time1[0] := time1[0] - deadtime
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- output/input from one channel
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    hostin0 ? ch[0]
    TIME ? time0[0]
    SEQ k = [1 FOR (block.size/4)]
      PAR
        hostout0 ! wbuffer0[k]
        hostin0 ? wbuffer1[k]
    TIME ? time1[0]
    time1[0] := time1[0] - deadtime
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- output to two channels
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
    TIME ? time0[0]
    SEQ k = [1 FOR (block.size/4)]
      PAR
        hostout0 ! wbuffer0[k]
        hostout1 ! wbuffer1[k]
    TIME ? time1[0]
    time1[0] := time1[0] - deadtime
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- output/input from two channels
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
    TIME ? time0[0]
    SEQ k = [1 FOR (block.size/4)]
      PAR
        hostout0 ! wbuffer0[k]
        hostout1 ! wbuffer1[k]
        hostin0 ? wbuffer2[k]
        hostin1 ? wbuffer3[k]
    TIME ? time1[0]
    time1[0] := time1[0] - deadtime
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- output to three channels
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
      hostin2 ? ch[2]
    TIME ? time0[0]
    -- output handling
    SEQ k = [1 FOR (block.size/4)]
      PAR
        hostout0 ! wbuffer0[k]
        hostout1 ! wbuffer1[k]
        hostout2 ! wbuffer2[k]
    TIME ? time1[0]
    time1[0] := time1[0] - deadtime
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- output/input from three channels
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
      hostin2 ? ch[2]
    TIME ? time0[0]
    -- output/input handling
    SEQ k = [1 FOR (block.size/4)]
      PAR
        hostout0 ! wbuffer0[k]
        hostout1 ! wbuffer1[k]
        hostout2 ! wbuffer2[k]
        hostin0 ? wbuffer0[k]
        hostin1 ? wbuffer1[k]
        hostin2 ? wbuffer2[k]
    TIME ? time1[0]
    time1[0] := time1[0] - deadtime
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab


-- output to four channels
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
      hostin2 ? ch[2]
      hostin3 ? ch[3]
    TIME ? time0[0]
    -- input and output handling
    SEQ k = [1 FOR (block.size/4)]
      PAR
        hostout0 ! wbuffer0[k]
        hostout1 ! wbuffer1[k]
        hostout2 ! wbuffer2[k]
        hostout3 ! wbuffer3[k]
    TIME ? time1[0]
    time1[0] := time1[0] - deadtime
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- input from four channels
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
      hostin2 ? ch[2]
      hostin3 ? ch[3]
    TIME ? time0[0]
    -- input handling
    SEQ k = [1 FOR (block.size/4)]
      PAR
        hostin0 ? wbuffer0[k]
        hostin1 ? wbuffer1[k]
        hostin2 ? wbuffer2[k]
        hostin3 ? wbuffer3[k]
    TIME ? time1[0]
    time1[0] := time1[0] - deadtime
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- all output and input in parallel
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
      hostin2 ? ch[2]
      hostin3 ? ch[3]
    TIME ? time0[0]
    -- input and output handling
    SEQ k = [1 FOR (block.size/4)]
      PAR
        hostout0 ! wbuffer0[k]
        hostout1 ! wbuffer1[k]
        hostout2 ! wbuffer2[k]
        hostout3 ! wbuffer3[k]
        hostin0 ? wbuffer0[k]
        hostin1 ? wbuffer1[k]
        hostin2 ? wbuffer2[k]
        hostin3 ? wbuffer3[k]
    TIME ? time1[0]
    time1[0] := time1[0] - deadtime
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
new.line
SKIP
new.line

-- send to screen operations done concurrently
IF
  cpumode = '0'
    write.string (" No other process running concurrently ")
  (((cpumode='2') OR (cpumode='4')) OR
   ((cpumode='6') OR (cpumode='8')))
    SEQ
      flag ! 'a'
      counter ? number
      write.string ("Number of operations (in //) at the ")
      write.string ("b001 transputer ")
      write.number (number)
      new.line
      hostin0 ? number
      write.string ("Number of operations (in //) at ")
      write.string ("transputer 0 (b003) ")
      write.number (number)
  TRUE
    SEQ
      hostin0 ? number
      write.string ("Number of operations (in //) at ")
      write.string ("transputer 0 (b003) ")
      write.number (number)
new.line
new.line :


SEQ  -- main int.transfer
  -- buffers initialization
  SEQ k = [1 FOR maxwordblock.size]
    SEQ
      wbuffer0[k] := 10000
      wbuffer1[k] := 20000
      wbuffer2[k] := 30000
      wbuffer3[k] := 40000
  SKIP
  IF
    cpumode = '2'
      PAR
        intransfer (repetition, cpumode, flag, counter)
        cpubusysum (flag, counter)
    cpumode = '4'
      PRI PAR
        intransfer (repetition, cpumode, flag, counter)
        cpubusysum (flag, counter)
    cpumode = '6'
      PAR
        intransfer (repetition, cpumode, flag, counter)
        cpubusyprod (flag, counter)
    cpumode = '8'
      PRI PAR
        intransfer (repetition, cpumode, flag, counter)
        cpubusyprod (flag, counter)
    TRUE
      intransfer (repetition, cpumode, flag, counter):
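
The primitive-based procedures above subtract a deadtime from each measurement: an empty replicated SEQ of the same length is timed once, and that loop overhead is removed from time1[0] before the rate is computed. The following Python sketch illustrates the same calibration idea under stated assumptions. It is not the thesis code; the function name, the clock parameter and the use of Python itself are all hypothetical choices made for illustration.

```python
import time


def timed_without_loop_overhead(work, iterations, clock=time.perf_counter):
    """Time `iterations` calls of work(), minus the cost of an empty loop.

    clock is any zero-argument callable returning the current time; it is
    injectable so the calculation can be checked with a fake clock.
    """
    # Calibration: an empty loop of the same length (the SEQ ... SKIP loop).
    start = clock()
    for _ in range(iterations):
        pass
    deadtime = clock() - start

    # Measurement: the same loop, now doing the actual work.
    start = clock()
    for _ in range(iterations):
        work()
    elapsed = clock() - start

    return elapsed - deadtime
```

The slice-based procedures in the listing do not apply this correction, since each transfer there is a single procedure call rather than a per-element loop.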
-- PROC word.slice.transfer (VALUE repetition, cpumode)
PROC word.slice.transfer (VALUE repetition, cpumode) =
  -- description
  -- It initializes the buffers and it executes the procedure
  -- wordtransfer, and, when applicable, one of the following:
  -- cpubusy.prod or cpubusy.sum (according to cpumode).
  -- Uses global constant maxblock.size.

  -- variable declarations
  CHAN flag,      -- flags the cpu to stop
       counter:   -- returns the number of operations cpu did
  VAR wbuffer0[maxwordblock.size + 1],
      wbuffer1[maxwordblock.size + 1],
      wbuffer2[maxwordblock.size + 1],
      wbuffer3[maxwordblock.size + 1]:


-- PROC wordtransfer (VALUE repetition, cpumode, CHAN flag, ...)
PROC wordtransfer (VALUE repetition, cpumode, CHAN flag, counter) =
  -- description
  -- Executes sequentially several parallel transfers of integers
  -- to/from one to four transputers using the WORD.SLICE procedures
  -- and outputs to the screen the transfer rate values of the
  -- output TABLE.
  -- Uses global constants: sizetable, nr.of.sizes, repetition

  -- variable declarations
  VAR block.size,     -- number of bytes to be transmitted
      actual.rate,    -- updated rate and final rate
      rate,           -- auxiliary variable to hold temporary rate
      number,         -- the number of operations cpu did
      ch[4],
      time0[4],
      time1[4]:
  SEQ
    SEQ i = [0 FOR nr.of.sizes]
      SEQ
        -- making the table after each io operation
        block.size := sizetable[i]
        write.number (block.size)
        Screen ! tab
        IF
          block.size < 4
            SEQ
              write.string ("minimum transfer for integers ")
              write.string ("is 4 bytes(word)")
          TRUE

ATTENTION ! The code is shifted left 12 spaces from here on,
due to printing requirements.

SEQ


-- output to one channel
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    hostin0 ? ch[0]
    TIME ? time0[0]
    WORD.SLICE.OUTPUT (hostout0, wbuffer0, 1, (block.size/4))
    TIME ? time1[0]
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- output/input in one link
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    hostin0 ? ch[0]
    TIME ? time0[0]
    PAR
      WORD.SLICE.OUTPUT (hostout0, wbuffer0, 1, (block.size/4))
      WORD.SLICE.INPUT (hostin0, wbuffer1, 1, (block.size/4))
    TIME ? time1[0]
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- output to two channels
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
    TIME ? time0[0]
    PAR
      WORD.SLICE.OUTPUT (hostout0, wbuffer0, 1, (block.size/4))
      WORD.SLICE.OUTPUT (hostout1, wbuffer1, 1, (block.size/4))
    TIME ? time1[0]
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- output/input in two links
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
    TIME ? time0[0]
    PAR
      WORD.SLICE.OUTPUT (hostout0, wbuffer0, 1, (block.size/4))
      WORD.SLICE.OUTPUT (hostout1, wbuffer1, 1, (block.size/4))
      WORD.SLICE.INPUT (hostin0, wbuffer2, 1, (block.size/4))
      WORD.SLICE.INPUT (hostin1, wbuffer3, 1, (block.size/4))
    TIME ? time1[0]
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- output to three channels
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
      hostin2 ? ch[2]
    TIME ? time0[0]
    PAR
      WORD.SLICE.OUTPUT (hostout0, wbuffer0, 1, (block.size/4))
      WORD.SLICE.OUTPUT (hostout1, wbuffer1, 1, (block.size/4))
      WORD.SLICE.OUTPUT (hostout2, wbuffer2, 1, (block.size/4))
    TIME ? time1[0]
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- output/input in three links
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
      hostin2 ? ch[2]
    TIME ? time0[0]
    PAR
      WORD.SLICE.OUTPUT (hostout0, wbuffer0, 1, (block.size/4))
      WORD.SLICE.OUTPUT (hostout1, wbuffer1, 1, (block.size/4))
      WORD.SLICE.OUTPUT (hostout2, wbuffer2, 1, (block.size/4))
      WORD.SLICE.INPUT (hostin0, wbuffer0, 1, (block.size/4))
      WORD.SLICE.INPUT (hostin1, wbuffer1, 1, (block.size/4))
      WORD.SLICE.INPUT (hostin2, wbuffer2, 1, (block.size/4))
    TIME ? time1[0]
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- output to four channels
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
      hostin2 ? ch[2]
      hostin3 ? ch[3]
    TIME ? time0[0]
    PAR
      WORD.SLICE.OUTPUT (hostout0, wbuffer0, 1, (block.size/4))
      WORD.SLICE.OUTPUT (hostout1, wbuffer1, 1, (block.size/4))
      WORD.SLICE.OUTPUT (hostout2, wbuffer2, 1, (block.size/4))
      WORD.SLICE.OUTPUT (hostout3, wbuffer3, 1, (block.size/4))
    TIME ? time1[0]
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- input from four channels
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    PAR
      hostin0 ? ch[0]
      hostin1 ? ch[1]
      hostin2 ? ch[2]
      hostin3 ? ch[3]
    TIME ? time0[0]
    PAR
      WORD.SLICE.INPUT (hostin0, wbuffer0, 1, (block.size/4))
      WORD.SLICE.INPUT (hostin1, wbuffer1, 1, (block.size/4))
      WORD.SLICE.INPUT (hostin2, wbuffer2, 1, (block.size/4))
      WORD.SLICE.INPUT (hostin3, wbuffer3, 1, (block.size/4))
    TIME ? time1[0]
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j
SKIP
write.number (actual.rate)
Screen ! tab

-- all output and input in parallel
actual.rate := 0
SEQ j = [1 FOR repetition]
  SEQ
    hostin0 ? ch[0]
    hostin1 ? ch[1]
    hostin2 ? ch[2]
    hostin3 ? ch[3]
    SKIP
    TIME ? time0[0]
    PAR
      WORD.SLICE.OUTPUT (hostout0, wbuffer0, 1, (block.size/4))
      WORD.SLICE.OUTPUT (hostout1, wbuffer1, 1, (block.size/4))
      WORD.SLICE.OUTPUT (hostout2, wbuffer2, 1, (block.size/4))
      WORD.SLICE.OUTPUT (hostout3, wbuffer3, 1, (block.size/4))
      WORD.SLICE.INPUT (hostin0, wbuffer0, 1, (block.size/4))
      WORD.SLICE.INPUT (hostin1, wbuffer1, 1, (block.size/4))
      WORD.SLICE.INPUT (hostin2, wbuffer2, 1, (block.size/4))
      WORD.SLICE.INPUT (hostin3, wbuffer3, 1, (block.size/4))
    TIME ? time1[0]
    transfer.rate (time0[0], time1[0], 1, block.size, rate)
    actual.rate := ((actual.rate * (j-1)) + rate)/j

ATTENTION ! End of code shifted 12 spaces to the left.

write.number (actual.rate)
new.line
SKIP
new.line

-- send to screen operations done concurrently
IF
  cpumode = '0'
    write.string (" No other process running concurrently ")
  (((cpumode='2') OR (cpumode='4'))
   OR ((cpumode='6') OR (cpumode='8')))
    SEQ
      flag ! 'a'
      counter ? number
      write.string ("Number of operations (in //) at the ")
      write.string ("b001 transputer ")
      write.number (number)
      new.line
      hostin0 ? number
      write.string ("Number of operations (in //) at ")
      write.string ("transputer 0 (b003) ")
      write.number (number)
  TRUE
    SEQ
      hostin0 ? number
      write.string ("Number of operations (in //) at ")
      write.string ("transputer 0 (b003) ")
      write.number (number)
new.line
new.line :


SEQ  -- main word.slice.transfer
  -- word buffers initialization
  SEQ k = [1 FOR maxwordblock.size]
    SEQ
      wbuffer0[k] := 10000
      wbuffer1[k] := 20000
      wbuffer2[k] := 30000
      wbuffer3[k] := 40000
  SKIP
  IF
    cpumode = '2'
      PAR
        wordtransfer (repetition, cpumode, flag, counter)
        cpubusysum (flag, counter)
    cpumode = '4'
      PRI PAR
        wordtransfer (repetition, cpumode, flag, counter)
        cpubusysum (flag, counter)
    cpumode = '6'
      PAR
        wordtransfer (repetition, cpumode, flag, counter)
        cpubusyprod (flag, counter)
    cpumode = '8'
      PRI PAR
        wordtransfer (repetition, cpumode, flag, counter)
        cpubusyprod (flag, counter)
    TRUE
      wordtransfer (repetition, cpumode, flag, counter):
-- SC PROC get.choices (CHAN Keyboard, Screen, VAR ch, run)

-- PROC get.choices (CHAN Keyboard, Screen, VAR ch, cpumode, run)
PROC get.choices (CHAN Keyboard, Screen, VAR ch, cpumode, run) =

-- presents menus and gets user's choices of cpumode and construct

-- modlibrary.occ

-- io_routines.occ (partial)

-- SC PROC new.line (CHAN Screen)
-- PROC new.line (CHAN Screen)
PROC new.line (CHAN Screen) =

-- jumps a line on the screen. May be compiled separately

DEF EndBuffer = -3 :
DEF cr = 13 :
DEF lf = 10 :

SEQ
  Screen ! cr; lf; EndBuffer :

--  descriptor 

--  code 

-- SC PROC write (CHAN Screen, VALUE string[])
-- PROC write (CHAN Screen, VALUE string[])
PROC write (CHAN Screen, VALUE string[]) =

-- Writes a given string to the screen, in a byte by byte fashion.
-- May be compiled separately

DEF EndBuffer = -3 :

SEQ
  SEQ i = [1 FOR string[BYTE 0]]
    Screen ! string[BYTE i]
  Screen ! EndBuffer :

--  descriptor 

--  code 

-- SC PROC clear (CHAN Screen)
-- PROC clear (CHAN Screen)
PROC clear (CHAN Screen) =
  -- clears the screen. May be compiled separately
  DEF EndBuffer = -3:
  DEF esc = 27 :
  SEQ
    Screen ! esc; '['; '2'; 'J'; EndBuffer  -- clear sequence
    Screen ! esc; '['; 'H' :  -- home cursor

--  descriptor 

--  code 

-- SC PROC write.number (CHAN Screen, VALUE number)
-- PROC write.number (CHAN Screen, VALUE number)
PROC write.number (CHAN Screen, VALUE number) =
  -- This PROC outputs a signed integer value to the screen
  -- May be compiled separately
  VAR output[16], count, x:
  SEQ
    x := number
    count := 0
    IF
      -- handle special cases
      x = 0
        Screen ! '0'
      x < 0
        SEQ
          Screen ! '-'
          x := -x
      TRUE
        SKIP
    WHILE x > 0
      -- construct number
      SEQ
        output[count] := (x \ 10) + '0'
        count := count + 1
        x := x / 10
    WHILE count > 0
      -- output number
      SEQ
        count := count - 1
        Screen ! output[count]
    SKIP:

--  descriptor 

--  code 

-- utilities.occ (partial)
-- SC PROC capitalize (VAR ch)
-- PROC capitalize (VAR ch)
PROC capitalize (VAR ch) =
  -- capitalizes any lower case character into upper case
  DEF delta = ('a' - 'A') :
  --- A ---> 65
  --- a ---> 97     ASCII values
  --- z ---> 122
  SEQ
    IF
      (ch <= 'z') AND (ch >= 'a')
        ch := ch - delta
      TRUE
        SKIP :

--  descriptor 

--  code 

-- global_def.tds (partial)
-- Constants Definitions
DEF EndBuffer = -3
DEF tab = 9
DEF lf = 10
DEF cr = 13
DEF esc = 27
DEF sp = 32

-- PROC write.header
PROC write.header =
  -- writes the header of the output table
  SEQ
    run := run + 1
    clear(Screen)
    write(Screen, "RUN # ")
    write.number (Screen, run)
    Screen ! sp;sp;sp
    -- output the cpu mode to the screen
    IF
      cpumode = '0'
        write(Screen, "cpu mode = 0 (no par proc )")
      cpumode = '1'
        write(Screen, "cpu mode = 1 (one sum par )")
      cpumode = '2'
        write(Screen, "cpu mode = 2 (all sum par )")
      cpumode = '3'
        write(Screen, "cpu mode = 3 (one sum pripar)")
      cpumode = '4'
        write(Screen, "cpu mode = 4 (all sum pripar)")
      cpumode = '5'
        write(Screen, "cpu mode = 5 (one prod par)")
      cpumode = '6'
        write(Screen, "cpu mode = 6 (all prod par)")
      cpumode = '7'
        write(Screen, "cpu mode = 7 (one prod pripar)")
      cpumode = '8'
        write(Screen, "cpu mode = 8 (all prod pripar)")
      TRUE
        SKIP
    Screen ! sp;sp
    -- output the construct type to the screen
    IF
      ch = 'A'
        write(Screen, "input/output channels (bytes) ")
      ch = 'B'
        write(Screen, "BYTE.SLICE.input/output (bytes) ")
      ch = 'I'
        write(Screen, "input/output channels (integers)")
      ch = 'W'
        write(Screen, "WORD.SLICE.input/output(integers)")
      TRUE
        SKIP
    new.line (Screen)
    new.line (Screen)
    write (Screen, "BYTES   1 OUT   1IN/OUT   2 OUT   2IN/OUT   3 OUT")
    write (Screen, "3IN/OUT   4 OUT   4 IN   4IN/OUT")
    new.line (Screen):

  VAR answer :
  SEQ
    IF
      run = 0
        SEQ
          -- output to the screen presentation of program
          clear(Screen)
          write(Screen, "     This is an Evaluation Program for ")
          write(Screen, "the Transputer")
          new.line (Screen)
          write(Screen, "  It is fully interactive and you will ")
          write(Screen, "be prompted in")
          new.line (Screen)
          write(Screen, "  each run to choose cpu mode and type ")
          write(Screen, "of construct ")
          new.line (Screen)
          write(Screen, "  The output table will present transfer ")
          write(Screen, "  rates in ")
          new.line (Screen)
          write(Screen, "  Kbits/sec for the 16 different ")
          write(Screen, "block.sizes and the 9")
          new.line (Screen)
          write(Screen, "  channel configurations ")
          new.line (Screen)
          new.line (Screen)
          write(Screen, "  TYPE (Y)ES   if you want to use it ")
          new.line (Screen)
          write(Screen, "      (N)O   if you want to quit ")
          new.line (Screen)
          answer := 'z'
          -- validate answer
          WHILE ((answer <> 'Y') AND (answer <> 'N'))
            SEQ
              write(Screen, "  Type your choice ")
              Keyboard ? answer
              capitalize (answer)
              Screen ! answer
              new.line (Screen)
      TRUE
        SKIP
    clear(Screen)

    -- choosing type of construct
    -- prompt for type of construct
    write(Screen, "  Choose type of construct to be used ")
    new.line(Screen)
    write(Screen, "  A for input/output channels    (bytes)  ")
    new.line(Screen)
    write(Screen, "  B for BYTE.SLICE input/output (bytes)")
    new.line(Screen)
    write(Screen, "  I for input/output channels   (words)  ")
    new.line(Screen)
    write(Screen, "  W for WORD.SLICE input/output (words)")
    new.line(Screen)
    -- validate type of construct
    ch := 'Z'
    WHILE (((ch <> 'A') AND (ch <> 'B')) AND ((ch <> 'W') AND (ch <> 'I')))
      SEQ
        write(Screen, "  Type your choice ")
        Keyboard ? ch
        capitalize (ch)
        Screen ! ch
        new.line(Screen)
        new.line(Screen)

    -- choosing cpumode during transfers
    -- prompt for cpu mode
    write(Screen, "  Choose cpu mode during transfers")
    new.line(Screen)
    write(Screen, "0  -> cpus executing no concurrent processes ")
    new.line(Screen)
    write(Screen, "1  -> B003 cpus executing sum concurrently (par)")
    new.line(Screen)
    write(Screen, "2  -> all cpus executing sum concurrently (par)")
    new.line(Screen)
    write(Screen, "3  -> B003 cpus executing sum concurrently (pripar)")
    new.line(Screen)
    write(Screen, "4  -> all cpus executing sum concurrently (pripar)")
    new.line(Screen)
    write(Screen, "5  -> B003 cpus executing array products   (par)")
    new.line(Screen)
    write(Screen, "6  -> all cpus executing array products   (par)")
    new.line(Screen)
    write(Screen, "7  -> B003 cpus executing array products  (pripar)")
    new.line(Screen)
    write(Screen, "8  -> all cpus executing array products   (pripar)")
    new.line(Screen)
    -- validate cpu mode
    cpumode := 10
    WHILE ((cpumode > #38) OR (cpumode < #30))
      --- '0' <= cpumode <= '8' (in ASCII)
      SEQ
        write(Screen, "  Type your choice ")
        Keyboard ? cpumode
        Screen ! cpumode
        new.line(Screen)
    write.header :

--  descriptor 
-- code

-- PROC user.interface
PROC user.interface =
  -- Presents menus and calls right modules to be executed
  -- in the transputer root.

  -- constant and variable declarations
  VAR run :        -- number of runs made this time (RUN #)
  VAR answer :     -- user's choice in continue or quit
  VAR construct :  -- user's choice of construct
  VAR cpumode :    -- user's choice of cpu mode while transferring data

  SEQ
    run := 0
    answer := 'z'
    clear.screen
    write.string ("  Do you want to use the Link Evaluation Program?")
    -- validate answer
    WHILE ((answer <> 'Y') AND (answer <> 'N'))
      SEQ
        new.line
        write.string ("  Type your choice (Y) or (N)")
        Keyboard ? answer
        capitalize (answer)
        Screen ! answer
        new.line
    WHILE answer = 'Y'
      SEQ
        get.choices (Keyboard, Screen, construct, cpumode, run)
        -- send choices to other transputers
        PAR
          hostout0 ! construct; cpumode; repetition
          hostout1 ! construct; cpumode; repetition
          hostout2 ! construct; cpumode; repetition
          hostout3 ! construct; cpumode; repetition
        -- executing the right procedure and prompting for new run
        IF
          construct = 'A'
            inout.transfer (repetition, cpumode)
          construct = 'B'
            byte.slice.transfer (repetition, cpumode)
          construct = 'I'
            int.transfer (repetition, cpumode)
          construct = 'W'
            word.slice.transfer (repetition, cpumode)
          TRUE
            SKIP
        -- prompt for another run and validate answer
        answer := 'Z'  -- to make the next loop be executed
        WHILE ((answer <> 'Y') AND (answer <> 'N'))
          SEQ
            write.string ("Do you want another run? Type (Y) or (N)")
            Keyboard ? answer
            capitalize (answer)
            Screen ! answer
            new.line
        -- send answer to other transputers
        PAR
          hostout0 ! answer
          hostout1 ! answer
          hostout2 ! answer
          hostout3 ! answer
    clear.screen
    write.string ("  Thank you for using the Link Evaluation Program")
    new.line
    write.string ("  Press reset on the b001 board to get back ")
    write.string ("  to VAX/VMS ") :

PAR
  IMS.B001.terminal.driver (Keyboard, Screen, port, baud)
  user.interface :

-- END OF CODE IN TRANSPUTER ROOT
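
The digit handling of write.number (io_routines.occ above) may be easier to follow outside Occam. The following Python fragment is only an illustrative sketch and is not part of the thesis code; the function name format_signed is hypothetical. It mirrors the same steps: the zero and negative cases are handled first, the decimal digits are then collected least significant first, and finally they are emitted in reverse order.

```python
def format_signed(number: int) -> str:
    """Sketch of write.number: collect digits in reverse, then emit them."""
    out = []      # characters that the Occam PROC would send to Screen
    digits = []   # plays the role of the output[16] vector
    x = number
    if x == 0:
        out.append("0")   # special case: the digit loop below emits nothing
    elif x < 0:
        out.append("-")
        x = -x
    while x > 0:
        # (x \ 10) + '0' in the Occam listing: remainder turned into a character
        digits.append(chr((x % 10) + ord("0")))
        x //= 10
    while digits:
        out.append(digits.pop())   # most significant digit comes out first
    return "".join(out)
```

For instance, format_signed(-123) collects the digits 3, 2, 1 and emits them after the minus sign in the order 1, 2, 3.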
-- TRANSPUTER0_B003.TDS

-- SC PROC transfer0.b003
-- PROC transfer0.b003 (CHAN in, out)
PROC transfer0.b003 (CHAN in, out) =
  -- description
  -- This is the outer procedure placed on transputer 0. It contains
  -- global variables and constants, and all procedures that run in this
  -- transputer. It receives a construct type (ch), cpu mode (cpumode),
  -- and the number of times each communication sequence is run
  -- (repetition), and calls accordingly one of the following procedures:
  --   - io.transfer0,
  --   - byte.slice.transfer0,
  --   - int.transfer0 or
  --   - word.slice.transfer0

  -- Link Definitions
  DEF link0in = 4
  DEF link0out = 0
  DEF link1in = 5
  DEF link1out = 1
  DEF link2in = 6
  DEF link2out = 2
  DEF link3in = 7
  DEF link3out = 3

  -- constant declarations
  DEF sizetable = TABLE [ 1, 2, 4, 8, 16, 32, 64, 128, 256, 512,
                          1024, 1280, 2048, 4096, 8192, 10000 ]:
  DEF nr.of.sizes = 16:        -- as counted from above table
  DEF maxblock.size = 10000:   -- last from the above table
  DEF maxwordblock.size = maxblock.size / 4:

  -- variables declarations
  VAR ch :          -- choice of the user in type of construct
  VAR answer :      -- choice of the user in continue
  VAR cpumode :     -- choice of the user in cpu operation concurrently
  VAR repetition :  -- choice of the user in number of times to run

-- SC PROC cpubusysum (CHAN flag1, counterchan)
-- PROC cpubusysum (CHAN flag1, counterchan)
PROC cpubusysum (CHAN flag1, counterchan) =
  -- description
  -- It keeps the cpu working in parallel (time sharing) with link
  -- transfers by doing sum operations. It stops when it receives
  -- a flag by the channel flag1 from the transfer procedure that
  -- is being executed concurrently.
  -- Outputs by channel counterchan the number of operations done.
  VAR a, b, e,
      working,
      counter,
      ch :
  SEQ
    counter := 0
    working := TRUE
    TIME ? a
    WHILE working
      ALT
        flag1 ? ch
          working := FALSE
        TIME ? b
          SEQ
            e := a + b
            counter := counter + 1
    counterchan ! counter:

-- descriptor

-- code

-- SC PROC cpubusyprod (CHAN flag1, counterchan)
-- PROC cpubusyprod (CHAN flag1, counterchan)
PROC cpubusyprod (CHAN flag1, counterchan) =
  -- description
  -- It keeps the cpu working in parallel (time sharing) with the link
  -- transfers by doing array multiplications. It stops when it receives
  -- a flag by the channel flag1 from the transfer procedure that is
  -- being executed concurrently. It outputs by channel counterchan
  -- the number of operations done.

  -- constants and variable declarations
  DEF number = 100:    -- size of array
  VAR a[number + 1],   -- array of integers
      b[number + 1],   -- array of integers
      e[number + 1],   -- array of integers
      clock,           -- integer - variable to get time
      working,         -- boolean - to stop execution
      counter,         -- integer - number of operations done
      ch :
  SEQ
    -- initialize buffers and variables
    SEQ i = [1 FOR number]
      SEQ
        a[i] := 3*i
        b[i] := 5*i
        SKIP
    counter := 0
    working := TRUE
    WHILE working
      ALT
        flag1 ? ch
          working := FALSE
        TIME ? clock
          SEQ
            SEQ i = [1 FOR number]
              e[i] := a[i] * b[i]
            counter := counter + number  -- updates nr. of operations
    counterchan ! counter:
-- PROC inout.transfer0 (VALUE repetition, cpumode)
PROC inout.transfer0 (VALUE repetition, cpumode) =
  -- description
  -- Initializes the buffers and executes the procedure iotransfer,
  -- plus, when applicable according to cpumode, one of the following:
  -- cpubusy.prod or cpubusy.sum.
  -- Uses global constant maxblock.size
  CHAN flag,      -- flags the cpu to stop
       counter :  -- returns the number of operations the cpu did

-- PROC iotransfer0 (VALUE repetition, cpumode, CHAN flag, counter)
PROC iotransfer0 (VALUE repetition, cpumode, CHAN done, opnumber) =
  -- description.io
  -- It executes sequentially several parallel transfers using the
  -- input/output primitives to/from transputer root.
  -- It uses the global constants sizetable, nr.of.sizes, repetition

  -- variable declarations
  VAR block.size,
      number,     -- the number of operations done by the cpu.
      ch[4] :
  VAR buffer0 [BYTE maxblock.size + 1] :
  VAR buffer1 [BYTE maxblock.size + 1] :
  SEQ
    -- initialize buffers
    SEQ i = [1 FOR maxblock.size]
      SEQ
        buffer0[BYTE i] := i\8
        buffer1[BYTE i] := i\8
        SKIP
    SEQ i = [0 FOR nr.of.sizes]
      SEQ
        block.size := sizetable[i]
        -- input and output handling
        -- input from one channel
        SEQ j = [1 FOR repetition]
          SEQ
            out ! 'a'
            SEQ k = [1 FOR block.size]
              in ? buffer0[BYTE k]
            SKIP
        -- input/output to/from one link
        SEQ j = [1 FOR repetition]
          SEQ
            out ! 'a'
            SEQ k = [1 FOR block.size]
              PAR
                in ? buffer0[BYTE k]
                out ! buffer1[BYTE k]
            SKIP
        -- input from two channels
        SEQ j = [1 FOR repetition]
          SEQ
            out ! 'a'
            SEQ k = [1 FOR block.size]
              in ? buffer0[BYTE k]
            SKIP
        -- input/output to two links
        SEQ j = [1 FOR repetition]
          SEQ
            out ! 'a'
            SEQ k = [1 FOR block.size]
              PAR
                in ? buffer0[BYTE k]
                out ! buffer1[BYTE k]
            SKIP
        -- input from three channels
        SEQ j = [1 FOR repetition]
          SEQ
            out ! 'a'
            SEQ k = [1 FOR block.size]
              in ? buffer0[BYTE k]
            SKIP
        -- input/output to three links
        SEQ j = [1 FOR repetition]
          SEQ
            out ! 'a'
            SEQ k = [1 FOR block.size]
              PAR
                out ! buffer0[BYTE k]
                in ? buffer1[BYTE k]
            SKIP
        -- input from four channels
        SEQ j = [1 FOR repetition]
          SEQ
            out ! 'a'
            SEQ k = [1 FOR block.size]
              in ? buffer0[BYTE k]
            SKIP
        -- output to four channels
        SEQ j = [1 FOR repetition]
          SEQ
            out ! 'a'
            SEQ k = [1 FOR block.size]
              out ! buffer0[BYTE k]
            SKIP
        -- all output and input in parallel
        SEQ j = [1 FOR repetition]
          SEQ
            out ! 'a'
            SEQ k = [1 FOR block.size]
              PAR
                in ? buffer0[BYTE k]
                out ! buffer1[BYTE k]
            SKIP
        SKIP
    IF
      -- cpumode not = '0' then get the number of computations done
      cpumode <> '0'
        SEQ
          flag ! 'a'          -- flag to stop procedure cpubusy
          opnumber ? number   -- receiving computations from cpubusy
          out ! number        -- sending computations to transputer root
      TRUE
        SKIP :

-- main PROC inout.transfer0
  IF
    ((cpumode = '1') OR (cpumode = '2'))
      PAR
        iotransfer0 (repetition, cpumode, flag, counter)
        cpubusysum (flag, counter)
    ((cpumode = '3') OR (cpumode = '4'))
      PRI PAR
        iotransfer0 (repetition, cpumode, flag, counter)
        cpubusysum (flag, counter)
    ((cpumode = '5') OR (cpumode = '6'))
      PAR
        iotransfer0 (repetition, cpumode, flag, counter)
        cpubusyprod (flag, counter)
    ((cpumode = '7') OR (cpumode = '8'))
      PRI PAR
        iotransfer0 (repetition, cpumode, flag, counter)
        cpubusyprod (flag, counter)
    TRUE
      iotransfer0 (repetition, cpumode, flag, counter):
-- PROC byte.slice.transfer0 (VALUE repetition, cpumode)
PROC byte.slice.transfer0 (VALUE repetition, cpumode) =
  -- description.
  -- Initializes the buffers and executes the procedure transfer,
  -- plus, when applicable according to cpumode, one of the following:
  -- cpubusy.prod or cpubusy.sum.
  -- Uses global constant maxblock.size
  CHAN flag,      -- flags the cpu to stop
       counter :  -- returns the number of operations the cpu did

-- PROC transfer0 (VALUE repetition, cpumode, CHAN flag, counter)
PROC transfer0 (VALUE repetition, cpumode, CHAN done, opnumber) =
  -- description
  -- It executes sequentially several parallel transfers using the
  -- BYTE.SLICE procedures to/from transputer root.
  -- It uses the global constants sizetable, nr.of.sizes, repetition

  -- variable declarations
  VAR block.size,
      number,     -- the number of operations done by the cpu.
      ch[4] :
  VAR buffer0 [BYTE maxblock.size + 1] :
  VAR buffer1 [BYTE maxblock.size + 1] :

  SEQ
    -- initialize buffers
    SEQ i = [1 FOR maxblock.size]
      SEQ
        buffer0[BYTE i] := i\8
        buffer1[BYTE i] := i\8
        SKIP
    SEQ i = [0 FOR nr.of.sizes]
      SEQ
        block.size := sizetable[i]
        -- input and output handling
        -- input from one channel
        SEQ j = [1 FOR repetition]
          SEQ
            out ! 'a'
            BYTE.SLICE.INPUT (in, buffer0, 1, block.size)
            SKIP
        -- input/output to one channel
        SEQ j = [1 FOR repetition]
          SEQ
            out ! 'a'
            PAR
              BYTE.SLICE.INPUT (in, buffer0, 1, block.size)
              BYTE.SLICE.OUTPUT (out, buffer1, 1, block.size)
            SKIP
        -- input from two channels
        SEQ j = [1 FOR repetition]
          SEQ
            out ! 'a'
            BYTE.SLICE.INPUT (in, buffer0, 1, block.size)
            SKIP
        -- input/output to two channels
        SEQ j = [1 FOR repetition]
          SEQ
            out ! 'a'
            PAR
              BYTE.SLICE.INPUT (in, buffer0, 1, block.size)
              BYTE.SLICE.OUTPUT (out, buffer1, 1, block.size)
            SKIP
        -- input from three channels
        SEQ j = [1 FOR repetition]
          SEQ
            out ! 'a'
            BYTE.SLICE.INPUT (in, buffer0, 1, block.size)
            SKIP
        -- input/output to three channels
        SEQ j = [1 FOR repetition]
          SEQ
            out ! 'a'
            PAR
              BYTE.SLICE.INPUT (in, buffer0, 1, block.size)
              BYTE.SLICE.OUTPUT (out, buffer1, 1, block.size)
            SKIP
        -- input from four channels
        SEQ j = [1 FOR repetition]
          SEQ
            out ! 'a'
            BYTE.SLICE.INPUT (in, buffer0, 1, block.size)
            SKIP
        -- output to four channels
        SEQ j = [1 FOR repetition]
          SEQ
            out ! 'a'
            BYTE.SLICE.OUTPUT (out, buffer0, 1, block.size)
            SKIP
        -- all output and input in parallel
        SEQ j = [1 FOR repetition]
          SEQ
            out ! 'a'
            PAR
              BYTE.SLICE.INPUT (in, buffer0, 1, block.size)
              BYTE.SLICE.OUTPUT (out, buffer1, 1, block.size)
            SKIP
        SKIP
    IF
      -- cpumode not = '0' then get the number of computations done.
      cpumode <> '0'
        SEQ
          flag ! 'a'
          opnumber ? number
          out ! number
      TRUE
        SKIP :

-- main PROC byte.slice.transfer0
  IF
    ((cpumode = '1') OR (cpumode = '2'))
      PAR
        transfer0 (repetition, cpumode, flag, counter)
        cpubusysum (flag, counter)
    ((cpumode = '3') OR (cpumode = '4'))
      PRI PAR
        transfer0 (repetition, cpumode, flag, counter)
        cpubusysum (flag, counter)
    ((cpumode = '5') OR (cpumode = '6'))
      PAR
        transfer0 (repetition, cpumode, flag, counter)
        cpubusyprod (flag, counter)
    ((cpumode = '7') OR (cpumode = '8'))
      PRI PAR
        transfer0 (repetition, cpumode, flag, counter)
        cpubusyprod (flag, counter)
    TRUE
      transfer0 (repetition, cpumode, flag, counter):



-- PROC int.transfer0 (VALUE repetition, cpumode)
PROC int.transfer0 (VALUE repetition, cpumode) =
  -- description.
  -- Initializes the buffers and executes the procedure intransfer,
  -- plus, when applicable according to cpumode, one of the following:
  -- cpubusy.prod or cpubusy.sum.
  -- Uses global constant maxwordblock.size
  CHAN flag,      -- flags the cpu to stop
       counter :  -- returns the number of operations the cpu did

-- PROC intransfer0 (VALUE repetition, cpumode, CHAN flag, counter)
PROC intransfer0 (VALUE repetition, cpumode, CHAN done, opnumber) =
  -- description
  -- It executes sequentially several parallel transfers using the
  -- input/output primitives to/from transputer root.
  -- It uses the global constants sizetable, nr.of.sizes, repetition

  -- variable declarations
  VAR block.size,
      number,     -- the number of operations done by the cpu.
      ch[4] :
  VAR wbuffer0 [maxwordblock.size + 1] :
  VAR wbuffer1 [maxwordblock.size + 1] :

  SEQ
    -- initialize buffers
    SEQ i = [1 FOR maxwordblock.size]
      SEQ
        wbuffer0[i] := i
        wbuffer1[i] := i
        SKIP
    SEQ i = [0 FOR nr.of.sizes]
      IF
        sizetable[i] < 4
          SKIP  -- minimum number of bytes is 4 for integer transfer
        TRUE
          SEQ
            block.size := sizetable[i]
            -- input and output handling
            -- input from one channel
            SEQ j = [1 FOR repetition]
              SEQ
                out ! 'a'
                SEQ k = [1 FOR (block.size/4)]
                  in ? wbuffer0[k]
                SKIP
            -- input/output to one link
            SEQ j = [1 FOR repetition]
              SEQ
                out ! 'a'
                SEQ k = [1 FOR (block.size/4)]
                  PAR
                    in ? wbuffer0[k]
                    out ! wbuffer1[k]
                SKIP
            -- input from two channels
            SEQ j = [1 FOR repetition]
              SEQ
                out ! 'a'
                SEQ k = [1 FOR (block.size/4)]
                  in ? wbuffer0[k]
                SKIP
            -- input/output to two links
            SEQ j = [1 FOR repetition]
              SEQ
                out ! 'a'
                SEQ k = [1 FOR (block.size/4)]
                  PAR
                    in ? wbuffer0[k]
                    out ! wbuffer1[k]
                SKIP
            -- input from three channels
            SEQ j = [1 FOR repetition]
              SEQ
                out ! 'a'
                SEQ k = [1 FOR (block.size/4)]
                  in ? wbuffer0[k]
                SKIP
            -- input/output to three links
            SEQ j = [1 FOR repetition]
              SEQ
                out ! 'a'
                SEQ k = [1 FOR (block.size/4)]
                  PAR
                    in ? wbuffer0[k]
                    out ! wbuffer1[k]
                SKIP
            -- input from four channels
            SEQ j = [1 FOR repetition]
              SEQ
                out ! 'a'
                SEQ k = [1 FOR (block.size/4)]
                  in ? wbuffer0[k]
                SKIP
            -- output to four channels
            SEQ j = [1 FOR repetition]
              SEQ
                out ! 'a'
                SEQ k = [1 FOR (block.size/4)]
                  out ! wbuffer0[k]
                SKIP
            -- all output and input in parallel
            SEQ j = [1 FOR repetition]
              SEQ
                out ! 'a'
                SEQ k = [1 FOR (block.size/4)]
                  PAR
                    in ? wbuffer0[k]
                    out ! wbuffer1[k]
                SKIP
            SKIP
    IF
      -- cpumode not = '0' then get the number of computations done.
      cpumode <> '0'
        SEQ
          flag ! 'a'
          opnumber ? number
          out ! number
      TRUE
        SKIP :

-- main PROC int.transfer0
  IF
    ((cpumode = '1') OR (cpumode = '2'))
      PAR
        intransfer0 (repetition, cpumode, flag, counter)
        cpubusysum (flag, counter)
    ((cpumode = '3') OR (cpumode = '4'))
      PRI PAR
        intransfer0 (repetition, cpumode, flag, counter)
        cpubusysum (flag, counter)
    ((cpumode = '5') OR (cpumode = '6'))
      PAR
        intransfer0 (repetition, cpumode, flag, counter)
        cpubusyprod (flag, counter)
    ((cpumode = '7') OR (cpumode = '8'))
      PRI PAR
        intransfer0 (repetition, cpumode, flag, counter)
        cpubusyprod (flag, counter)
    TRUE
      intransfer0 (repetition, cpumode, flag, counter):
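
The integer transfers above skip every table entry smaller than 4 bytes and move block.size/4 words for the remaining ones. The short Python fragment below is only an illustrative sketch of that arithmetic, not part of the thesis code; the names SIZETABLE, BYTES_PER_WORD and words_per_block are hypothetical, and the table values are copied from the sizetable definition in this listing.

```python
# Block sizes in bytes, copied from the sizetable definition of the listing.
SIZETABLE = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512,
             1024, 1280, 2048, 4096, 8192, 10000]

BYTES_PER_WORD = 4  # the listing divides block.size by 4


def words_per_block(sizetable=SIZETABLE):
    """Return (block size in bytes, words moved) for every size that is not skipped."""
    return [(size, size // BYTES_PER_WORD)
            for size in sizetable
            if size >= BYTES_PER_WORD]   # entries below 4 bytes take the SKIP branch
```

Under these assumptions 14 of the 16 table entries lead to a word transfer, the largest one moving 2500 words, which equals maxwordblock.size.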
-- PROC word.slice.transfer0 (VALUE repetition, cpumode)
PROC word.slice.transfer0 (VALUE repetition, cpumode) =
  -- description.
  -- Initializes the buffers and executes the procedure wordtransfer,
  -- plus, when applicable according to cpumode, one of the following:
  -- cpubusy.prod or cpubusy.sum.
  -- Uses global constant maxwordblock.size
  CHAN flag,      -- flags the cpu to stop
       counter :  -- returns the number of operations the cpu did

-- PROC wordtransfer0 (VALUE repetition, cpumode, CHAN flag,...)
PROC wordtransfer0 (VALUE repetition, cpumode, CHAN done, opnumber) =
  -- description
  -- It executes sequentially several parallel transfers using the
  -- WORD.SLICE procedures to/from transputer root.
  -- It uses the global constants sizetable, nr.of.sizes, repetition

  -- variable declarations
  VAR block.size,
      number,     -- the number of operations done by the cpu.
      ch[4] :
  VAR wbuffer0 [maxwordblock.size + 1] :
  VAR wbuffer1 [maxwordblock.size + 1] :

  SEQ
    -- initialize buffers
    SEQ i = [1 FOR maxwordblock.size]
      SEQ
        wbuffer0[i] := i
        wbuffer1[i] := i
        SKIP
    SEQ i = [0 FOR nr.of.sizes]
      IF
        sizetable[i] < 4
          SKIP  -- minimum number of bytes is 4 for integer transfer
        TRUE
          SEQ
            block.size := sizetable[i]
            -- input and output handling
            -- input from one channel
            SEQ j = [1 FOR repetition]
              SEQ
                out ! 'a'
                WORD.SLICE.INPUT (in, wbuffer0, 1, (block.size/4))
                SKIP
            -- input/output to one link
            SEQ j = [1 FOR repetition]
              SEQ
                out ! 'a'
                PAR
                  WORD.SLICE.INPUT (in, wbuffer0, 1, (block.size/4))
                  WORD.SLICE.OUTPUT (out, wbuffer1, 1, (block.size/4))
                SKIP
            -- input from two channels
            SEQ j = [1 FOR repetition]
              SEQ
                out ! 'a'
                WORD.SLICE.INPUT (in, wbuffer0, 1, (block.size/4))
                SKIP
            -- input/output to two links
            SEQ j = [1 FOR repetition]
              SEQ
                out ! 'a'
                PAR
                  WORD.SLICE.INPUT (in, wbuffer0, 1, (block.size/4))
                  WORD.SLICE.OUTPUT (out, wbuffer1, 1, (block.size/4))
                SKIP
            -- input from three channels
            SEQ j = [1 FOR repetition]
              SEQ
                out ! 'a'
                WORD.SLICE.INPUT (in, wbuffer0, 1, (block.size/4))
                SKIP
            -- input/output to three links
            SEQ j = [1 FOR repetition]
              SEQ
                out ! 'a'
                PAR
                  WORD.SLICE.INPUT (in, wbuffer0, 1, (block.size/4))
                  WORD.SLICE.OUTPUT (out, wbuffer1, 1, (block.size/4))
                SKIP
            -- input from four channels
            SEQ j = [1 FOR repetition]
              SEQ
                out ! 'a'
                WORD.SLICE.INPUT (in, wbuffer0, 1, (block.size/4))
                SKIP
            -- output to four channels
            SEQ j = [1 FOR repetition]
              SEQ
                out ! 'a'
                WORD.SLICE.OUTPUT (out, wbuffer0, 1, (block.size/4))
                SKIP
            -- all output and input in parallel
            SEQ j = [1 FOR repetition]
              SEQ
                out ! 'a'
                PAR
                  WORD.SLICE.INPUT (in, wbuffer0, 1, (block.size/4))
                  WORD.SLICE.OUTPUT (out, wbuffer1, 1, (block.size/4))
                SKIP
            SKIP
    IF
      -- cpumode not = '0' then get the number of computations done.
      cpumode <> '0'
        SEQ
          flag ! 'a'
          opnumber ? number
          out ! number
      TRUE
        SKIP :

-- main PROC word.slice.transfer0
  IF
    ((cpumode = '1') OR (cpumode = '2'))
      PAR
        wordtransfer0 (repetition, cpumode, flag, counter)
        cpubusysum (flag, counter)
    ((cpumode = '3') OR (cpumode = '4'))
      PRI PAR
        wordtransfer0 (repetition, cpumode, flag, counter)
        cpubusysum (flag, counter)
    ((cpumode = '5') OR (cpumode = '6'))
      PAR
        wordtransfer0 (repetition, cpumode, flag, counter)
        cpubusyprod (flag, counter)
    ((cpumode = '7') OR (cpumode = '8'))
      PRI PAR
        wordtransfer0 (repetition, cpumode, flag, counter)
        cpubusyprod (flag, counter)
    TRUE
      wordtransfer0 (repetition, cpumode, flag, counter):

-- procedure body transfer0.b003
  SEQ
    answer := 'Y'
    WHILE answer = 'Y'
      SEQ
        in ? ch
        in ? cpumode
        in ? repetition
        IF
          ch = 'A'
            inout.transfer0 (repetition, cpumode)
          ch = 'B'
            byte.slice.transfer0 (repetition, cpumode)
          ch = 'I'
            int.transfer0 (repetition, cpumode)
          ch = 'W'
            word.slice.transfer0 (repetition, cpumode)
          TRUE
            SKIP
        in ? answer :

-- END OF CODE IN TRANSPUTER 0 B003
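
Each of the four transfer procedures on transputer 0 ends with the same selection on cpumode. The Python fragment below is only an illustrative sketch of that selection and is not part of the thesis code; the function name select_workload and the tuple it returns are hypothetical. It maps a cpu mode character to the busy procedure run beside the transfer and to the construct (PAR or PRI PAR) that combines them.

```python
def select_workload(cpumode: str):
    """Return (busy procedure, construct) for a cpu mode character, as in the IF above."""
    if cpumode in ("1", "2"):
        return ("cpubusysum", "PAR")
    if cpumode in ("3", "4"):
        return ("cpubusysum", "PRI PAR")
    if cpumode in ("5", "6"):
        return ("cpubusyprod", "PAR")
    if cpumode in ("7", "8"):
        return ("cpubusyprod", "PRI PAR")
    # TRUE branch of the Occam IF: the transfer procedure runs alone
    return (None, None)
```

With this reading, mode '0' (and any unexpected character) runs the transfer alone.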
-- TRANSPUTER1_B003.TDS

-- SC PROC transfer1.b003
-- PROC transfer1.b003 (CHAN in, out)
PROC transfer1.b003 (CHAN in, out) =
  -- description
  -- This is the outer procedure placed on transputer 1. It contains
  -- global variables and constants, and all procedures that run in this
  -- transputer. It receives a construct type (ch), cpu mode (cpumode),
  -- and the number of times each communication sequence is run
  -- (repetition), and calls accordingly one of the following procedures:
  --   - io.transfer1,
  --   - byte.slice.transfer1,
  --   - int.transfer1 or
  --   - word.slice.transfer1

  -- Link Definitions
  DEF link0in = 4
  DEF link0out = 0
  DEF link1in = 5
  DEF link1out = 1
  DEF link2in = 6
  DEF link2out = 2
  DEF link3in = 7
  DEF link3out = 3

  -- constant declarations
  DEF sizetable = TABLE [ 1, 2, 4, 8, 16, 32, 64, 128, 256, 512,
                          1024, 1280, 2048, 4096, 8192, 10000 ]:
  DEF nr.of.sizes = 16:        -- as counted from above table
  DEF maxblock.size = 10000:   -- last from the above table
  DEF maxwordblock.size = maxblock.size / 4:

  -- variable declarations
  VAR ch :          -- choice of the user in type of construct
  VAR answer :      -- choice of the user in continue
  VAR cpumode :     -- choice of the user in cpu operation concurrently
  VAR repetition :  -- choice of the user in number of times to run

-- SC PROC cpubusysum (CHAN flag1, counterchan)
-- PROC cpubusysum (CHAN flag1, counterchan)
PROC cpubusysum (CHAN flag1, counterchan) =
  -- description
  -- It keeps the cpu working in parallel (time sharing) with link
  -- transfers by doing sum operations. It stops when it receives
  -- a flag by the channel flag1 from the transfer procedure that is
  -- being executed concurrently. It outputs by channel counterchan
  -- the number of operations done.
  VAR a, b, e,
      working,
      counter,
      ch :
  SEQ
    counter := 0
    working := TRUE
    TIME ? a
    WHILE working
      ALT
        flag1 ? ch
          working := FALSE
        TIME ? b
          SEQ
            e := a + b
            counter := counter + 1
    counterchan ! counter:

-- descriptor

-- code

--  SC  PROC  cpubu 
--  PROC  cpubusyp 
PROC  cpubusyprod 
--  description 

___    -k -k  ■k  ■k  y^ -k  7<  ■k -k 'k  ■k 

It  keeps  th 

transfers  b 

a  flag  by  t 

being  execu 

the  number 

___    kkkkkkkkkkk 


syprod  (CHAN  flagl ,counterchan) 
rod  (CHAN  flagl ,counterchan) 
(CHAN  flagl ,counterchan)= 

kkkkkkkkk-kkkkkkkkkkkk-k-k-k-k-k-kkkkkkkkkkkkk-k:k-k-k-k-k**:-k-k:k-k:k-k 

e  cpu  working  in  parallel(time  sharing)  with  the  link 
y  doing  array  multiplications.  It  stops  when  receives 
he  channel  flaql  from  the  transfer  procedure,  that  is 
ted  concurrently.  It  outputs  by  channel  counterchan 
of  operations  done. 

kkkkT^kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk 


-- constants and variable declarations

DEF number = 100:      -- size of array
VAR a[number + 1],     -- array of integers
    b[number + 1],     -- array of integers
    e[number + 1],     -- array of integers
    clock,             -- integer - variable to get time
    working,           -- boolean - to stop execution
    counter,           -- integer - number of operations done
    ch :

SEQ 


-- initialize buffers and variables
SEQ  i  =  [  1  FOR  number  ] 
SEQ 

a[i]  :=  3*i 
b[i] := 5*i
SKIP 

counter  :=  0 
working  :=  TRUE 

WHILE  working 
ALT 

flagl  ?  ch 

working  :=  FALSE 
TIME  ?  clock 
SEQ 

SEQ  i  =  [1  FOR  number] 
e[i]  :=  a[i]  *  b[i] 
counter := counter + number   -- updates nr. of operations
counterchan ! counter:
-- PROC inout.transferl (VALUE repetition, cpumode)
PROC inout.transferl (VALUE repetition, cpumode) =
-- description
-- Initializes the buffers and executes the procedures iotransfer,
-- plus, when applicable according to cpumode, one of the following:
-- cpubusy.prod or cpubusy.sum.
-- Uses global constant maxblock.size

CHAN flag,       -- flags the cpu to stop
     counter :   -- returns the number of operations the cpu did

-- PROC iotransferl (VALUE repetition, cpumode, CHAN done, opnumber)
PROC iotransferl (VALUE repetition, cpumode, CHAN done, opnumber) =
-- description
-- It executes sequentially several parallel transfers using the
-- input/output primitives to/from transputer root.
-- It uses the global constants: sizetable, nr.of.sizes, repetition

-- variable declarations
VAR block.size,
    number,      -- the number of operations done by the cpu.
    ch[4] :
VAR bufferO [BYTE maxblock.size + 1] :
VAR bufferl [BYTE maxblock.size + 1] :

SEQ 

--  initialize  buffers 
SEQ  i  =  [1  FOR  maxblock.size] 
SEQ

bufferO[BYTE i] := i\8
bufferl[BYTE i] := i\8
SKIP 

SEQ  i  =  [0  FOR  nr. of. sizes] 
SEQ 

block.size := sizetable[i]
--  input  and  output  handling 
--  input  from  two  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ  k  =  [1  FOR  block. size] 
in ? bufferO[BYTE k]
SKIP 

--  input/output  to  two  links 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ  k  =  [1  FOR  block. size] 
PAR 

in  ?  bufferO[BYTE  k] 
out  !  bufferl [BYTE  k] 
SKIP 

--  input  from  three  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ  k  =  [1  FOR  block. size] 
in ? bufferO[BYTE k]
SKIP 

--  input/output  to  three  links 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 
SEQ k = [1 FOR block.size]

PAR 

in ? bufferO[BYTE k]
out  !  bufferl[BYTE  k] 
SKIP 

--  input  from  four  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ  k  =  [1  FOR  block. size] 
in ? bufferO[BYTE k]
SKIP 

--  output  to  four  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ  k  =  [1  FOR  block. size] 
out  !  bufferO[BYTE  k] 
SKIP 

--  all  output  and  input  in  parallel 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ k = [1 FOR block.size]
PAR 

in ? bufferO[BYTE k]
out  !  bufferl[BYTE  k] 
SKIP 

SKIP 
IF 

--  cpumode  NOT  =  '0'  then  get  the  number  of  computations  done 
cpumode  <>  '0' 
SEQ 

done  !  'a' 
opnumber  ?  number 

TRUE 

SKIP   : 

--  main  PROC  inout . transferl 
IF 

((cpumode  =  '1')  OR  (cpumode  =  '2')) 
PAR 

iotransferl  (repetition,  cpumode,  flag,  counter) 
cpubusysum  (flag,  counter) 
((cpumode  =  '3')  OR  (cpumode  =  '4')) 
PRI  PAR 

iotransferl  (repetition,  cpumode,  flag,  counter) 
cpubusysum  (flag,  counter) 
((cpumode  =  '5')  OR  (cpumode  =  '6')) 
PAR 

iotransferl  (repetition,  cpumode,  flag,  counter) 
cpubusyprod  (flag,  counter) 
((cpumode  =  '7')  OR  (cpumode  =  '8')) 
PRI  PAR 

iotransferl  (repetition,  cpumode,  flag,  counter) 
cpubusyprod  (flag,  counter) 
TRUE 

iotransferl  (repetition,  cpumode,  flag,  counter): 
-- PROC byte.slice.transferl (VALUE repetition, cpumode)
PROC byte.slice.transferl (VALUE repetition, cpumode) =
-- description
-- Initializes the buffers and executes the procedures transfer,
-- plus, when applicable according to cpumode, one of the following:
-- cpubusy.prod or cpubusy.sum.
-- Uses global constant maxblock.size

CHAN flag,       -- flags the cpu to stop
     counter :   -- returns the number of operations the cpu did

-- PROC transferl (VALUE repetition, cpumode, CHAN done, opnumber)
PROC transferl (VALUE repetition, cpumode, CHAN done, opnumber) =
-- description
-- It executes sequentially several parallel transfers using the
-- BYTE.SLICE procedures to/from transputer root.
-- It uses the global constants: sizetable, nr.of.sizes,
-- repetition

-- variable declarations
VAR block.size,
    number,      -- the number of operations done by the cpu.
    ch[4] :
VAR bufferO [BYTE maxblock.size + 1] :
VAR bufferl [BYTE maxblock.size + 1] :


SEQ 

--  initialize  buffers 
SEQ  i  =  [1  FOR  maxblock.size] 
SEQ 

bufferO[BYTE i] := i\8
bufferl[BYTE i] := i\8
SKIP 

SEQ  i  =  [0  FOR  nr. of. sizes] 
SEQ 

block.size := sizetable[i]
-- input and output handling
--  input  from  two  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

BYTE.SLICE.INPUT (in, bufferO, 1, block.size)
SKIP 

--  input/output  to  two  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 
PAR 

BYTE.SLICE.INPUT (in, bufferO, 1, block.size)
BYTE.SLICE.OUTPUT (out, bufferl, 1, block.size)
SKIP 

--  input  from  three  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

BYTE.SLICE.INPUT (in, bufferO, 1, block.size)
SKIP 

--  input/output  to  three  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 
PAR 

BYTE.SLICE.INPUT (in, bufferO, 1, block.size)
BYTE.SLICE.OUTPUT (out, bufferl, 1, block.size)
SKIP 

--  input  from  four  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

BYTE.SLICE.INPUT (in, bufferO, 1, block.size)
SKIP 

--  output  to  four  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

BYTE.SLICE.OUTPUT (out, bufferO, 1, block.size)
SKIP 

--  all  output  and  input  in  parallel 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 
PAR 

BYTE.SLICE.INPUT (in, bufferO, 1, block.size)
BYTE.SLICE.OUTPUT (out, bufferl, 1, block.size)
SKIP 

SKIP 
IF 

--  cpumode  not  =  '0'  then  get  the  number  of  computations  done, 
cpumode  <>  '0' 
SEQ 

done  !  'a' 
opnumber  ?  number 

TRUE 

SKIP   : 

-- main byte.slice.transferl
IF 

((cpumode  =  '1')  OR  (cpumode  =  '2')) 
PAR 

transferl  (repetition,  cpumode,  flag,  counter) 
cpubusysum  (flag,  counter) 
((cpumode  =  '3')  OR  (cpumode  =  '4')) 
PRI  PAR 

transferl  (repetition,  cpumode,  flag,  counter) 
cpubusysum  (flag,  counter) 
((cpumode = '5') OR (cpumode = '6'))
PAR 

transferl  (repetition,  cpumode,  flag,  counter) 
cpubusyprod (flag, counter)
((cpumode  =  '7')  OR  (cpumode  =  '8')) 
PRI  PAR 

transferl  (repetition,  cpumode,  flag,  counter) 
cpubusyprod (flag, counter)
TRUE 

transferl  (repetition,  cpumode,  flag,  counter): 




-- PROC int.transferl (VALUE repetition, cpumode)
PROC int.transferl (VALUE repetition, cpumode) =
-- description
-- Initializes the buffers and executes the procedures intransfer,
-- plus, when applicable according to cpumode, one of the following:
-- cpubusy.prod or cpubusy.sum.
-- Uses global constant maxblock.size

CHAN flag,       -- flags the cpu to stop
     counter :   -- returns the number of operations the cpu did

-- PROC intransferl (VALUE repetition, cpumode, CHAN done, ...)
PROC intransferl (VALUE repetition, cpumode, CHAN done, opnumber) =
-- description
-- It executes sequentially several parallel transfers of integers
-- using the input/output primitives to/from transputer root.
-- It uses the global constants: sizetable, nr.of.sizes,
-- repetition

-- variable declarations
VAR block.size,
    number,      -- the number of operations done by the cpu.
    ch[4] :
VAR wbufferO [maxwordblock.size + 1] :
VAR wbufferl [maxwordblock.size + 1] :


SEQ 

--  initialize  buffers 
SEQ  i  =  [1  FOR  maxwordblock.size] 
SEQ 

wbufferO[i] := i
wbufferl[i] := i
SKIP 

SEQ i = [0 FOR nr.of.sizes]
IF
sizetable[i] < 4
SKIP   -- minimum number of bytes is 4 for integer transfer
TRUE
SEQ

block.size := sizetable[i]
--  input  and  output  handling 
--  input  from  two  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ  k  =  [1  FOR  (block. size/4)] 
in  ?  wbufferO[k] 
SKIP 

--  input/output  to  two  links 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ  k  =  [1  FOR  (block. size/4)] 
PAR 

in  ?  wbufferO [k] 
out  !  wbufferl [k] 
SKIP 

--  input  from  three  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ k = [1 FOR (block.size/4)]
in ? wbufferO[k]
SKIP

--  input/output  to  three  links 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ k = [1 FOR (block.size/4)]
PAR

in ? wbufferO[k]
out  !  wbufferl[k] 
SKIP 

--  input  from  four  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ  k  =  [1  FOR  (block. size/4)] 
in ? wbufferO[k]
SKIP 

--  output  to  four  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ  k  =  [1  FOR  (block. size/4)] 
out  !  wbufferO[k] 
SKIP 

--  all  output  and  input  in  parallel 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ  k  =  [1  FOR  (block. size/4)] 
PAR 

in ? wbufferO[k]
out  !  wbufferl[k] 
SKIP 

SKIP 
IF 

--  cpumode  not  =  '0'  then  get  the  number  of  computations  done 
cpumode  <>  '0' 
SEQ 

done ! 'a'
opnumber  ?  number 

TRUE 

SKIP   : 

-- main PROC int.transferl
IF 

((cpumode = '1') OR (cpumode = '2'))
PAR 

intransferl  (repetition,  cpumode,  flag,  counter) 
cpubusysum  (flag,  counter) 
((cpumode  =  '3')  OR  (cpumode  =  '4')) 
PRI  PAR 

intransferl  (repetition,  cpumode,  flag,  counter) 
cpubusysum  (flag,  counter) 
((cpumode  =  '5')  OR  (cpumode  =  '6')) 
PAR 

intransferl  (repetition,  cpumode,  flag,  counter) 
cpubusyprod (flag, counter)
((cpumode  =  '7')  OR  (cpumode  =  '8')) 
PRI  PAR 

intransferl  (repetition,  cpumode,  flag,  counter) 
cpubusyprod  (flag,  counter) 
TRUE 

intransferl  (repetition,  cpumode,  flag,  counter): 
-- PROC word.slice.transferl (VALUE repetition, cpumode)
PROC word.slice.transferl (VALUE repetition, cpumode) =
-- description
-- Initializes the buffers and executes the procedures wordtransfer,
-- plus, when applicable according to cpumode, one of the following:
-- cpubusy.prod or cpubusy.sum.
-- Uses global constant maxblock.size

CHAN flag,       -- flags the cpu to stop
     counter :   -- returns the number of operations the cpu did

-- PROC wordtransferl (VALUE repetition, cpumode, CHAN done, ...)
PROC wordtransferl (VALUE repetition, cpumode, CHAN done, opnumber) =
-- description
-- It executes sequentially several parallel transfers of integers
-- using the WORD.SLICE procedure to/from transputer root.
-- It uses the global constants: sizetable, nr.of.sizes,
-- repetition
-- ****************************************************************

-- variable declarations
VAR block.size,
    number,      -- the number of operations done by the cpu.
    ch[4] :
VAR wbufferO [maxwordblock.size + 1] :
VAR wbufferl [maxwordblock.size + 1] :


SEQ 

--  initialize  buffers 
SEQ  i  =  [1  FOR  maxwordblock. size] 
SEQ 

wbufferO[i] := i
wbufferl[i]  :=  i 
SKIP 

SEQ i = [0 FOR nr.of.sizes]
IF
sizetable[i] < 4
SKIP   -- minimum number of bytes is 4 for integer transfer
TRUE
SEQ

block.size := sizetable[i]
--  input  and  output  handling 
--  input  from  two  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

WORD.SLICE.INPUT (in, wbufferO, 1, (block.size/4))
SKIP 

--  input/output  to  two  links 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 
PAR 

WORD.SLICE.INPUT (in, wbufferO, 1, (block.size/4))
WORD.SLICE.OUTPUT (out, wbufferl, 1, (block.size/4))
SKIP 

--  input  from  three  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

WORD.SLICE.INPUT (in, wbufferO, 1, (block.size/4))
SKIP 

--  input/output  to  three  links 
SEQ  j  =  [1  FOR  repetition] 
SEQ

out  !  'a' 
PAR 

WORD.SLICE.INPUT (in, wbufferO, 1, (block.size/4))
WORD.SLICE.OUTPUT (out, wbufferl, 1, (block.size/4))
SKIP 

--  input  from  four  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

WORD.SLICE.INPUT (in, wbufferO, 1, (block.size/4))
SKIP 

--  output  to  four  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

WORD.SLICE.OUTPUT (out, wbufferO, 1, (block.size/4))
SKIP 

--  all  output  and  input  in  parallel 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 
PAR 

WORD.SLICE.INPUT (in, wbufferO, 1, (block.size/4))
WORD.SLICE.OUTPUT (out, wbufferl, 1, (block.size/4))
SKIP 

SKIP 
IF 

--  cpumode  not  =  '0'  then  get  the  number  of  computations  done, 
cpumode  <>  '0' 
SEQ 

done ! 'a'
opnumber  ?  number 

TRUE 

SKIP   : 

-- main PROC word.slice.transferl
IF 

((cpumode  =  '1')  OR  (cpumode  =  '2')) 
PAR 

wordtransferl (repetition, cpumode, flag, counter)
cpubusysum  (flag,  counter) 
((cpumode  =  '3')  OR  (cpumode  =  '4')) 
PRI  PAR 

wordtransferl  (repetition,  cpumode,  flag,  counter) 
cpubusysum  (flag,  counter) 
((cpumode  =  '5')  OR  (cpumode  =  '6')) 
PAR 

wordtransferl  (repetition,  cpumode,  flag,  counter) 
cpubusyprod  (flag,  counter) 
((cpumode  =  '7')  OR  (cpumode  =  '8')) 
PRI  PAR 

wordtransferl  (repetition,  cpumode,  flag,  counter) 
cpubusyprod  (flag,  counter) 
TRUE 

wordtransferl  (repetition,  cpumode,  flag,  counter): 

-- procedure body transferl.b003
SEQ 

answer  :=  'Y' 

WHILE   answer  =  'Y' 

SEQ

in ? ch

in  ?  cpumode 

in  ?  repetition 

IF 

ch  =  'A' 
inout.transferl (repetition, cpumode)
ch  =  'B' 

byte. slice. transferl  (repetition, cpumode) 
ch = 'I'

int. transferl  (repetition, cpumode) 
ch = 'W'

word. slice. transferl  (repetition, cpumode) 
TRUE 

SKIP 
in  ?  answer  : 

-- END OF CODE IN TRANSPUTER 1 B003
-- TRANSPUTER2_B003.TDS

-- SC PROC transfer2.b003
-- PROC transfer2.b003 (CHAN in, out)
PROC transfer2.b003 (CHAN in, out) =
-- description
-- This is the outer procedure placed on transputer 2. It contains
-- global variables and constants, and all procedures that run in this
-- transputer. It receives a construct type (ch), cpu mode (cpumode),
-- and the number of times to run each communication sequence
-- (repetition), and calls accordingly one of the following procedures:
-- - io.transfer2,
-- - byte.slice.transfer2,
-- - int.transfer2 or
-- - word.slice.transfer2

--   Link  Definitions 

DEF link0in = 4
DEF link0out = 0
DEF link1in = 5
DEF link1out = 1
DEF link2in = 6
DEF link2out = 2
DEF link3in = 7
DEF link3out = 3

--  constant  declarations 

DEF sizetable = TABLE [ 1, 2, 4, 8, 16, 32, 64, 128, 256, 512,
                        1024, 1280, 2048, 4096, 8192, 10000 ]:
DEF nr.of.sizes = 16:                     -- as counted from above table
DEF maxblock.size = 10000:                -- last from the above table
DEF maxwordblock.size = maxblock.size/4:

-- variable declarations

VAR ch:          -- choice of the user in type of construct
VAR answer:      -- choice of the user in continue
VAR cpumode:     -- choice of the user in cpu operation concurrently
VAR repetition:  -- choice of the user in number of times to run

-- SC PROC cpubusysum (CHAN flagl, counterchan)
-- PROC cpubusysum (CHAN flagl, counterchan)
PROC cpubusysum (CHAN flagl, counterchan) =
-- description
-- It keeps the cpu working in parallel (time sharing) with link
-- transfers by doing sum operations. It stops when it receives
-- a flag by the channel flagl from the transfer procedure that is
-- being executed concurrently. It outputs by channel counterchan
-- the number of operations done.

VAR a, b, e,
    working,
    counter,
    ch :

SEQ 

counter  :=  0 
working  :=  TRUE 
TIME  ?  a 
WHILE  working 
ALT 

flagl  ?  ch 

working  :=  FALSE 
TIME  ?  b 
SEQ 

e  :=  a  +  b 

counter  :=  counter  +  1 
counterchan  !  counter: 
-- descriptor

--  code 

-- SC PROC cpubusyprod (CHAN flagl, counterchan)
-- PROC cpubusyprod (CHAN flagl, counterchan)
PROC cpubusyprod (CHAN flagl, counterchan) =
-- description
-- It keeps the cpu working in parallel (time sharing) with the link
-- transfers by doing array multiplications. It stops when it receives
-- a flag by the channel flagl from the transfer procedure, that is
-- being executed concurrently. It outputs by channel counterchan
-- the number of operations done.

-- constants and variable declarations

DEF number = 100:      -- size of array
VAR a[number + 1],     -- array of integers
    b[number + 1],     -- array of integers
    e[number + 1],     -- array of integers
    clock,             -- integer - variable to get time
    working,           -- boolean - to stop execution
    counter,           -- integer - number of operations done
    ch :


SEQ 


-- initialize buffers and variables
SEQ i = [1 FOR number]
SEQ

a[i] := 3*i
b[i] := 5*i
SKIP

counter := 0
working := TRUE

WHILE  working 
ALT 

flagl  ?  ch 

working  :=  FALSE 
TIME  ?  clock 
SEQ 

SEQ  i  =  [1  FOR  number] 
e[i] := a[i] * b[i]
counter := counter + number   -- updates nr. of operations
counterchan ! counter:
-- PROC inout.transfer2 (VALUE repetition, cpumode)
PROC inout.transfer2 (VALUE repetition, cpumode) =
-- description
-- Initializes the buffers and executes the procedures iotransfer,
-- plus, when applicable according to cpumode, one of the following:
-- cpubusy.prod or cpubusy.sum.
-- Uses global constant maxblock.size

CHAN flag,       -- flags the cpu to stop
     counter :   -- returns the number of operations the cpu did

-- PROC iotransfer2 (VALUE repetition, cpumode, CHAN done, counter)
PROC iotransfer2 (VALUE repetition, cpumode, CHAN done, opnumber) =

-- variable declarations
VAR block.size,
    number,      -- the number of operations done by the cpu.
    ch[4] :
VAR bufferO [BYTE maxblock.size + 1] :
VAR bufferl [BYTE maxblock.size + 1] :


SEQ 

--  initialize  buffers 
SEQ  i  =  [1  FOR  maxblock. size] 
SEQ 

bufferO[BYTE i] := i\8
bufferl[BYTE i] := i\8
SKIP 

SEQ  i  =  [0  FOR  nr. of. sizes] 
SEQ

block.size := sizetable[i]
--  input  and  output  handling 
--  input  from  three  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ  k  =  [1  FOR  block. size] 
in ? bufferO[BYTE k]
SKIP 

--  output  to  three  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ  k  =  [1  FOR  block. size] 
PAR 

in ? bufferO[BYTE k]
out  !  bufferl[BYTE  k] 
SKIP 

--  input  from  four  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ  k  =  [1  FOR  block. size] 
in ? bufferO[BYTE k]
SKIP 

--  output  to  four  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ  k  =  [1  FOR  block. size] 
out  !  bufferO[BYTE  k] 
SKIP 

--  all  output  and  input  in  parallel 
SEQ  j  =  [1  FOR  repetition] 
SEQ 
out ! 'a'

SEQ  k  =  [1  FOR  block. size] 
PAR

in ? bufferO[BYTE k]
out  !  bufferl[BYTE  k] 
SKIP 

SKIP 
IF 

--  cpumode  NOT  =  '0'  then  get  the  number  of  computations  done 
cpumode  <>  '0' 
SEQ 

done  !  'a' 
opnumber  ?  number 

TRUE 

SKIP   : 

--  main  inout . transfer2 
IF 

((cpumode  =  '1')  OR  (cpumode  =  '2')) 
PAR 

iotransfer2 (repetition, cpumode, flag, counter)
cpubusysum  (flag,  counter) 
((cpumode  =  '3')  OR  (cpumode  =  '4')) 
PRI  PAR 

iotransfer2  (repetition,  cpumode,  flag,  counter) 
cpubusysum  (flag,  counter) 
((cpumode  =  '5')  OR  (cpumode  =  '6')) 
PAR 

iotransfer2  (repetition,  cpumode,  flag,  counter) 
cpubusyprod  (flag,  counter) 
((cpumode  =  '7')  OR  (cpumode  =  '8')) 
PRI  PAR 

iotransfer2  (repetition,  cpumode,  flag,  counter) 
cpubusyprod  (flag,  counter) 
TRUE 

iotransfer2  (repetition,  cpumode,  flag,  counter): 
-- PROC byte.slice.transfer2 (VALUE repetition, cpumode)
PROC byte.slice.transfer2 (VALUE repetition, cpumode) =
-- description
-- Initializes the buffers and executes the procedures iotransfer,
-- plus, when applicable according to cpumode, one of the following:
-- cpubusy.prod or cpubusy.sum.
-- Uses global constant maxblock.size

CHAN flag,       -- flags the cpu to stop
     counter :   -- returns the number of operations the cpu did

-- PROC transfer2 (VALUE repetition, cpumode, CHAN done, counter)
PROC transfer2 (VALUE repetition, cpumode, CHAN done, opnumber) =

-- variable declarations
VAR block.size,
    number,      -- the number of operations done by the cpu.
    ch[4] :
VAR bufferO [BYTE maxblock.size + 1] :
VAR bufferl [BYTE maxblock.size + 1] :

SEQ 

--  initialize  buffers 
SEQ  i  =  [1  FOR  maxblock.size] 
SEQ 

bufferO[BYTE i] := i\8
bufferl[BYTE i] := i\8
SKIP 

SEQ  i  =  [0  FOR  nr. of. sizes] 
SEQ 

block. size  :=  sizetable[i] 
-- input from three channels
SEQ j = [1 FOR repetition]
SEQ 

out  !  'a' 

BYTE.SLICE.INPUT (in, bufferO, 1, block.size)
SKIP 

--  input/output  to  three  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 
PAR 

BYTE.SLICE.INPUT (in, bufferO, 1, block.size)
BYTE.SLICE.OUTPUT (out, bufferl, 1, block.size)
SKIP 

--  input  from  four  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

BYTE.SLICE.INPUT (in, bufferO, 1, block.size)
SKIP 

--  output  to  four  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

BYTE.SLICE.OUTPUT (out, bufferO, 1, block.size)
SKIP 

--  all  output  and  input  in  parallel 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 
PAR 

BYTE.SLICE.INPUT (in, bufferO, 1, block.size)
BYTE.SLICE.OUTPUT (out, bufferl, 1, block.size)
SKIP 
SKIP
IF 

--  cpumode  NOT  =  '0'  then  get  the  number  of  computations  done. 
cpumode  <>  '0' 
SEQ 

done ! 'a'
opnumber  ?  number 

TRUE 

SKIP   : 

-- main byte.slice.transfer2
IF 

((cpumode  =  '1')  OR  (cpumode  =  '2')) 
PAR 

transfer2 (repetition, cpumode, flag, counter)
cpubusysum  (flag,  counter) 
((cpumode  =  '3')  OR  (cpumode  =  '4')) 
PRI  PAR 

transfer2  (repetition,  cpumode,  flag,  counter) 
cpubusysum  (flag,  counter) 
((cpumode  =  '5')  OR  (cpumode  =  '6')) 
PAR 

transfer2  (repetition,  cpumode,  flag,  counter) 
cpubusyprod  (flag,  counter) 
((cpumode  =  '7')  OR  (cpumode  =  '8')) 
PRI  PAR 

transfer2  (repetition,  cpumode,  flag,  counter) 
cpubusyprod  (flag,  counter) 
TRUE 

transfer2  (repetition,  cpumode,  flag,  counter): 
-- PROC int.transfer2 (VALUE repetition, cpumode)
PROC int.transfer2 (VALUE repetition, cpumode) =
-- description
-- Initializes the buffers and executes the procedures intransfer,
-- plus, when applicable according to cpumode, one of the following:
-- cpubusy.prod or cpubusy.sum.
-- Uses global constant maxblock.size

CHAN flag,       -- flags the cpu to stop
     counter :   -- returns the number of operations the cpu did

-- PROC intransfer2 (VALUE repetition, cpumode, CHAN done, ...)
PROC intransfer2 (VALUE repetition, cpumode, CHAN done, opnumber) =
-- description.io
-- It executes sequentially several parallel transfers of integers
-- using the input/output primitives to/from transputer root.
-- It uses the global constants: sizetable, nr.of.sizes,
-- repetition

-- variable declarations
VAR block.size,
    number,      -- the number of operations done by the cpu.
    ch[4] :
VAR wbufferO [maxwordblock.size + 1] :
VAR wbufferl [maxwordblock.size + 1] :


SEQ 

--  initialize  buffers 
SEQ  i  =  [1  FOR  maxwordblock.size] 
SEQ 

wbufferO[i] := i
wbufferl[i] := i
SKIP 

SEQ  i  =  [0  FOR  nr. of. sizes] 
IF 

sizetable[i]  <  4 

SKIP   -- minimum number of bytes is 4 for integer transfer

TRUE 
SEQ

block.size := sizetable[i]
--  input  and  output  handling 
--  input  from  three  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ  k  =  [1  FOR  (block. size/4)] 
in ? wbufferO[k]
SKIP 

--  input/output  to  three  links 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ  k  =  [1  FOR  (block. size/4)] 
PAR 

in  ?  wbufferO [k] 
out  !  wbufferl [k] 
SKIP 

--  input  from  four  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ k = [1 FOR (block.size/4)]
in ? wbufferO[k]
SKIP

--  output  to  four  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ  k  =  [1  FOR  (block. size/4)] 
out  !  wbufferO[k] 
SKIP 

--  all  output  and  input  in  parallel 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

SEQ  k  =  [1  FOR  (block. size/4)] 
PAR 

in ? wbufferO[k]
out  !  wbufferl[k] 
SKIP 

SKIP 
IF 

--  cpumode  NOT  =  '0'  then  get  the  number  of  computations  done 
cpumode  <>  '0' 
SEQ 

done ! 'a'
opnumber  ?  number 

TRUE 

SKIP   : 

-- main int.transfer2
IF 

((cpumode  =  '1')  OR  (cpumode  =  '2')) 
PAR 

intransfer2  (repetition,  cpumode,  flag,  counter) 
cpubusysum  (flag,  counter) 
((cpumode  =  '3')  OR  (cpumode  =  '4')) 
PRI  PAR 

intransfer2  (repetition,  cpumode,  flag,  counter) 
cpubusysum  (flag,  counter) 
((cpumode  =  '5')  OR  (cpumode  =  '6')) 
PAR 

intransfer2  (repetition,  cpumode,  flag,  counter) 
cpubusyprod  (flag,  counter) 
((cpumode  =  '7')  OR  (cpumode  =  '8')) 
PRI  PAR 

intransfer2  (repetition,  cpumode,  flag,  counter) 
cpubusyprod  (flag,  counter) 
TRUE 

intransfer2  (repetition,  cpumode,  flag,  counter): 
-- PROC word.slice.transfer2 (VALUE repetition, cpumode)
PROC word.slice.transfer2 (VALUE repetition, cpumode) =
-- description
-- Initializes the buffers and executes the procedures wordtransfer,
-- plus, when applicable according to cpumode, one of the following:
-- cpubusy.prod or cpubusy.sum.
-- Uses global constant maxblock.size

CHAN flag,       -- flags the cpu to stop
     counter :   -- returns the number of operations the cpu did

-- PROC wordtransfer2 (VALUE repetition, cpumode, CHAN done, ...)
PROC wordtransfer2 (VALUE repetition, cpumode, CHAN done, opnumber) =

-- variable declarations
VAR block.size,
    number,      -- the number of operations done by the cpu.
    ch[4] :
VAR wbufferO [maxwordblock.size + 1] :
VAR wbufferl [maxwordblock.size + 1] :

SEQ

-- initialize buffers
SEQ  i  =  [1  FOR  maxwordblock. size] 
SEQ 

wbufferO [i]  :=  i 
wbufferl[i]  :=  i 
SKIP 

SEQ  i  =  [0  FOR  nr. of. sizes] 
IF 

sizetable[i]  <  4 

SKIP   -- minimum number of bytes is 4 for integer transfer
TRUE 
SEQ 

block.size := sizetable[i]
--  input  and  output  handling 
--  input  from  three  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

WORD.SLICE.INPUT (in, wbufferO, 1, (block.size/4))
SKIP 

--  input/output  to  three  links 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 
PAR 
WORD.SLICE.INPUT (in, wbufferO, 1, (block.size/4))
WORD.SLICE.OUTPUT (out, wbufferl, 1, (block.size/4))
SKIP 

--  input  from  four  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

WORD.SLICE.INPUT (in, wbufferO, 1, (block.size/4))
SKIP 

--  output  to  four  channels 
SEQ  j  =  [1  FOR  repetition] 
SEQ 

out  !  'a' 

WORD.SLICE.OUTPUT (out, wbufferO, 1, (block.size/4))
SKIP 

--  all  output  and  input  in  parallel 
SEQ  j  =  [1  FOR  repetition] 
SEQ 
out ! 'a'

PAR 

WORD.SLICE.INPUT (in, wbufferO, 1, (block.size/4))
WORD.SLICE.OUTPUT (out, wbufferl, 1, (block.size/4))
SKIP 

SKIP 
IF 

--  cpumode  NOT  =  '0'  then  get  the  number  of  computations  done. 
cpumode <> '0'
SEQ 

done  !  'a' 
opnumber  ?  number 

TRUE 

SKIP   : 

-- main word.slice.transfer2
IF 

((cpumode  =  '1')  OR  (cpumode  =  '2')) 
PAR 

wordtransfer2  (repetition,  cpumode,  flag,  counter) 
cpubusysum  (flag,  counter) 
((cpumode  =  '3')  OR  (cpumode  =  '4')) 
PRI  PAR 

wordtransfer2  (repetition,  cpumode,  flag,  counter) 
cpubusysum  (flag,  counter) 
((cpumode  =  '5')  OR  (cpumode  =  '6')) 
PAR 

wordtransfer2  (repetition,  cpumode,  flag,  counter) 
cpubusyprod  (flag,  counter) 
((cpumode  =  '7')  OR  (cpumode  =  '8')) 
PRI  PAR 

wordtransfer2  (repetition,  cpumode,  flag,  counter) 
cpubusyprod  (flag,  counter) 
TRUE 

wordtransfer2  (repetition,  cpumode,  flag,  counter): 

--  procedure  body  transfer2.b003 
SEQ 

answer  :=  'Y' 
WHILE   answer  =  'Y' 
SEQ 

in ? ch

in  ?  cpumode 

in  ?  repetition 

IF 

ch  =  'A' 

inout.transfer2 (repetition, cpumode)
ch  =  'B' 

byte.slice.transfer2 (repetition, cpumode)
ch  =  'I' 

int. transfer2  (repetition, cpumode) 
ch  =  'W' 

word.slice.transfer2 (repetition, cpumode)
TRUE 
SKIP 
in  ?  answer   : 

-- END OF CODE IN TRANSPUTER 2
-- TRANSPUTER3_B003.TDS

-- SC PROC transfer3.b003
-- PROC transfer3.b003 (CHAN in, out)
PROC transfer3.b003 (CHAN in, out) =
-- description
-- This is the outer procedure placed on transputer 3. It contains
-- global variables and constants, and all procedures that run in this
-- transputer. It receives a construct type (ch), cpu mode (cpumode),
-- and the number of times to run each communication sequence
-- (repetition), and calls accordingly one of the following procedures:
-- - io.transfer3,
-- - byte.slice.transfer3,
-- - int.transfer3 or
-- - word.slice.transfer3

--  Link Definitions
DEF link0in = 4
DEF link0out = 0
DEF link1in = 5
DEF link1out = 1
DEF link2in = 6
DEF link2out = 2
DEF link3in = 7
DEF link3out = 3

--  constant declarations
DEF sizetable = TABLE [ 1, 2, 4, 8, 16, 32, 64, 128, 256, 512,
1024, 1280, 2048, 4096, 8192, 10000]:
DEF nr.of.sizes = 16:  -- as counted from above table
DEF maxblock.size = 10000:  -- last from the above table
DEF maxwordblock.size = maxblock.size/4 :

--  variable declarations
VAR ch :  -- choice of the user in type of construct
VAR answer :  -- choice of the user in continue
VAR cpumode :  -- choice of the user in cpu operation concurrently
VAR repetition :  -- choice of the user in number of times to run

--  SC PROC cpubusysum (CHAN flagl, counterchan)
--  PROC cpubusysum (CHAN flagl, counterchan)
PROC cpubusysum (CHAN flagl, counterchan) =
--  description.cpu
--  It keeps the cpu working in parallel (time sharing) with link
--  transfers by doing sum operations. It stops when it receives
--  a flag by the channel flagl from the transfer procedure that is
--  being executed concurrently. It outputs by channel counterchan
--  the number of operations done.
VAR a, b, e,
working,
counter,
ch :

SEQ 

counter  :=  0 
working  :=  TRUE 
TIME  ?  a 
WHILE  working 
ALT 

flagl  ?  ch 

working  :=  FALSE 
TIME  ?  b 
SEQ 

e  :=  a  +  b 

counter  :=  counter  +  1 
counterchan  !  counter: 
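
The pattern used by cpubusysum (do one unit of work, check whether the stop flag has arrived, repeat, then report the count) can be sketched outside Occam. The following is a minimal Python sketch, not the thesis code: the names busy_sum, stop_requested and flag_after are hypothetical, and a simple poll stands in for the ALT between the flag channel and the always-ready TIME input, so the scheduling behaviour of a real transputer is not reproduced.

```python
def busy_sum(stop_requested, a=1, b=2):
    """Count sum operations performed until stop_requested() returns True.

    stop_requested is a hypothetical stand-in for the flag channel: it is
    polled once per iteration, roughly as the ALT checks the flag input.
    """
    counter = 0
    while not stop_requested():
        e = a + b          # the unit of "busy" work
        counter += 1
    return counter


def flag_after(n):
    """Return a callable that answers False n times and True afterwards."""
    calls = {"count": 0}

    def poll():
        calls["count"] += 1
        return calls["count"] > n

    return poll
```

Calling busy_sum(flag_after(5)) performs five additions before the simulated flag arrives; the returned count plays the role of the value sent on counterchan.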
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--  descriptor 

--  code 

--  SC PROC cpubusyprod (CHAN flagl, counterchan)
--  PROC cpubusyprod (CHAN flagl, counterchan)
PROC cpubusyprod (CHAN flagl, counterchan) =
--  description
--  It keeps the cpu working in parallel (time sharing) with the link
--  transfers by doing array multiplications. It stops when it receives
--  a flag by the channel flagl from the transfer procedure that is
--  being executed concurrently. It outputs by channel counterchan
--  the number of operations done.

--  constants and variable declarations
DEF number = 100:  -- size of array
VAR a[number + 1],  -- array of integers
b[number + 1],  -- array of integers
e[number + 1],  -- array of integers
clock,  -- integer - variable to get time
working,  -- boolean - to stop execution
counter,  -- integer - number of operations done
ch :


SEQ 


--  initialize buffers and variables
SEQ i = [1 FOR number]
SEQ
a[i] := 3*i
b[i] := 5*i
SKIP

counter  :=  0 
working  :=  TRUE 

WHILE  working 
ALT 

flagl  ?  ch 

working  :=  FALSE 
TIME  ?  clock 
SEQ 

SEQ  i  =  [1  FOR  number] 

e[i] := a[i] * b[i]
counter := counter + number  -- updates nr. of operations

counterchan  !  counter: 
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--  PROC  inout. transfers  (VALUE  repetition, cpumode) 
PROC inout.transfer3 (VALUE repetition, cpumode) =
--  description
--  Initializes the buffers and executes the procedure iotransfer3,
--  plus, when applicable according to cpumode, one of the following:
--  cpubusyprod or cpubusysum.
--  Uses global constant maxblock.size
CHAN flag,  -- flags the cpu to stop
counter :  -- returns the number of operations the cpu did

--  PROC iotransfer3 (VALUE repetition, cpumode, CHAN done,...)
PROC iotransfer3 (VALUE repetition, cpumode, CHAN done, opnumber) =
--  description
--  It executes sequentially several parallel transfers of bytes
--  using the input/output primitives to/from transputer root.
--  It uses the global constants: sizetable, nr.of.sizes,
--  repetition
--  variable declarations
VAR block.size,
number,  -- the number of operations done by the cpu.
ch[4] :
VAR buffer0 [BYTE maxblock.size + 1] :
VAR buffer1 [BYTE maxblock.size + 1] :


SEQ 

--  initialize buffers
SEQ i = [1 FOR maxblock.size]
SEQ
buffer0[BYTE i] := i\8
buffer1[BYTE i] := i\8
SKIP

SEQ i = [0 FOR nr.of.sizes]
SEQ
block.size := sizetable[i]
--  input from four channels
SEQ j = [1 FOR repetition]
SEQ
out ! 'a'
SEQ k = [1 FOR block.size]
in ? buffer0[BYTE k]
SKIP

--  output to four channels
SEQ j = [1 FOR repetition]
SEQ
out ! 'a'
SEQ k = [1 FOR block.size]
out ! buffer0[BYTE k]
SKIP

--  all output and input in parallel
SEQ j = [1 FOR repetition]
SEQ
out ! 'a'
SEQ k = [1 FOR block.size]
PAR
in ? buffer0[BYTE k]
out ! buffer1[BYTE k]
SKIP

SKIP
IF
--  cpumode not = '0' then get the number of computations done
cpumode <> '0'
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SEQ 

done  !  'a'
opnumber  ?  number 

TRUE 

SKIP   : 

--  main inout.transfer3
IF
((cpumode = '1') OR (cpumode = '2'))
PAR
iotransfer3 (repetition, cpumode, flag, counter)
cpubusysum (flag, counter)
((cpumode = '3') OR (cpumode = '4'))
PRI PAR
iotransfer3 (repetition, cpumode, flag, counter)
cpubusysum (flag, counter)
((cpumode = '5') OR (cpumode = '6'))
PAR
iotransfer3 (repetition, cpumode, flag, counter)
cpubusyprod (flag, counter)
((cpumode = '7') OR (cpumode = '8'))
PRI PAR
iotransfer3 (repetition, cpumode, flag, counter)
cpubusyprod (flag, counter)
TRUE
iotransfer3 (repetition, cpumode, flag, counter) :
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--  PROC  byte. slice. transfers  (VALUE  repetition, cpumode) 
PROC byte.slice.transfer3 (VALUE repetition, cpumode) =
--  description
--  Initializes the buffers and executes the procedure transfer3,
--  plus, when applicable according to cpumode, one of the following:
--  cpubusyprod or cpubusysum.
--  Uses global constant maxblock.size
CHAN flag,  -- flags the cpu to stop
counter :  -- returns the number of operations the cpu did

--  PROC transfer3 (VALUE repetition, cpumode, CHAN done, ...)
PROC transfer3 (VALUE repetition, cpumode, CHAN done, opnumber) =
--  description
--  It executes sequentially several parallel transfers of BYTES
--  using the BYTE.SLICE procedures to/from transputer root.
--  It uses the global constants: sizetable, nr.of.sizes,
--  repetition
--  variable declarations
VAR block.size,
number,  -- the number of operations done by the cpu.
ch[4] :
VAR buffer0 [BYTE maxblock.size + 1] :
VAR buffer1 [BYTE maxblock.size + 1] :


SEQ 

--  initialize buffers
SEQ i = [1 FOR maxblock.size]
SEQ
buffer0[BYTE i] := i\8
buffer1[BYTE i] := i\8
SKIP

SEQ i = [0 FOR nr.of.sizes]
SEQ
block.size := sizetable[i]
--  input from four channels
SEQ j = [1 FOR repetition]
SEQ
out ! 'a'
BYTE.SLICE.INPUT (in, buffer0, 1, block.size)
SKIP

--  output to four channels
SEQ j = [1 FOR repetition]
SEQ
out ! 'a'
BYTE.SLICE.OUTPUT (out, buffer0, 1, block.size)
SKIP

--  all output and input in parallel
SEQ j = [1 FOR repetition]
SEQ
out ! 'a'
PAR
BYTE.SLICE.INPUT (in, buffer0, 1, block.size)
BYTE.SLICE.OUTPUT (out, buffer1, 1, block.size)
SKIP

SKIP
IF
--  cpumode not = '0' then get the number of computations done
cpumode <> '0'
SEQ
done ! 'a'
opnumber ? number
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TRUE 

SKIP   : 

--  main byte.slice.transfer3
IF
((cpumode = '1') OR (cpumode = '2'))
PAR
transfer3 (repetition, cpumode, flag, counter)
cpubusysum (flag, counter)
((cpumode = '3') OR (cpumode = '4'))
PRI PAR
transfer3 (repetition, cpumode, flag, counter)
cpubusysum (flag, counter)
((cpumode = '5') OR (cpumode = '6'))
PAR
transfer3 (repetition, cpumode, flag, counter)
cpubusyprod (flag, counter)
((cpumode = '7') OR (cpumode = '8'))
PRI PAR
transfer3 (repetition, cpumode, flag, counter)
cpubusyprod (flag, counter)
TRUE
transfer3 (repetition, cpumode, flag, counter) :
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--  PROC  int. transfers  (VALUE  repetition, cpumode) 
PROC int.transfer3 (VALUE repetition, cpumode) =
--  description
--  Initializes the buffers and executes the procedure intransfer3,
--  plus, when applicable according to cpumode, one of the following:
--  cpubusyprod or cpubusysum.
--  Uses global constant maxblock.size
CHAN flag,  -- flags the cpu to stop
counter :  -- returns the number of operations the cpu did

--  PROC intransfer3 (VALUE repetition, cpumode, CHAN done, ...)
PROC intransfer3 (VALUE repetition, cpumode, CHAN done, opnumber) =
--  description
--  It executes sequentially several parallel transfers of integers
--  using the input/output primitives to/from transputer root.
--  It uses the global constants: sizetable, nr.of.sizes,
--  repetition
--  variable declarations
VAR block.size,
number,  -- the number of operations done by the cpu.
ch[4] :
VAR wbuffer0 [maxwordblock.size + 1] :
VAR wbuffer1 [maxwordblock.size + 1] :

SEQ 

--  initialize buffers
SEQ i = [1 FOR maxwordblock.size]
SEQ
wbuffer0[BYTE i] := i
wbuffer1[BYTE i] := i
SKIP

SEQ i = [0 FOR nr.of.sizes]
IF
sizetable[i] < 4
SKIP  -- minimum number of bytes is 4 for integer transfer
TRUE
SEQ
block.size := sizetable[i]
--  input from four channels
SEQ j = [1 FOR repetition]
SEQ
out ! 'a'
SEQ k = [1 FOR (block.size/4)]
in ? wbuffer0[k]
SKIP

--  output to four channels
SEQ j = [1 FOR repetition]
SEQ
out ! 'a'
SEQ k = [1 FOR (block.size/4)]
out ! wbuffer0[k]
SKIP

--  all output and input in parallel
SEQ j = [1 FOR repetition]
SEQ
out ! 'a'
SEQ k = [1 FOR (block.size/4)]
PAR
in ? wbuffer0[k]
out ! wbuffer1[k]
SKIP
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SKIP 

IF 

--  cpumode  not='0'  then  get  the  number  of  computations  done 
cpumode  <>  '0' 
SEQ 

done  !  'a' 
opnumber  ?  number 

TRUE 

SKIP   : 

--  main int.transfer3
IF
((cpumode = '1') OR (cpumode = '2'))
PAR
intransfer3 (repetition, cpumode, flag, counter)
cpubusysum (flag, counter)
((cpumode = '3') OR (cpumode = '4'))
PRI PAR
intransfer3 (repetition, cpumode, flag, counter)
cpubusysum (flag, counter)
((cpumode = '5') OR (cpumode = '6'))
PAR
intransfer3 (repetition, cpumode, flag, counter)
cpubusysum (flag, counter)
((cpumode = '7') OR (cpumode = '8'))
PRI PAR
intransfer3 (repetition, cpumode, flag, counter)
cpubusysum (flag, counter)
TRUE
intransfer3 (repetition, cpumode, flag, counter) :
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--  PROC  word. slice. transfers  (VALUE  repetition, cpumode) 
PROC word.slice.transfer3 (VALUE repetition, cpumode) =
--  description
--  Initializes the buffers and executes the procedure wordtransfer3,
--  plus, when applicable according to cpumode, one of the following:
--  cpubusyprod or cpubusysum.
--  Uses global constant maxblock.size
CHAN flag,  -- flags the cpu to stop
counter :  -- returns the number of operations the cpu did

--  PROC wordtransfer3 (VALUE repetition, cpumode, CHAN done,...)
PROC wordtransfer3 (VALUE repetition, cpumode, CHAN done, opnumber) =
--  description
--  It executes sequentially several parallel transfers of integers
--  using the WORD.SLICE procedures to/from transputer root.
--  It uses the global constants: sizetable, nr.of.sizes,
--  repetition
--  variable declarations
VAR block.size,
number,  -- the number of operations done by the cpu.
ch[4] :
VAR wbuffer0 [maxwordblock.size + 1] :
VAR wbuffer1 [maxwordblock.size + 1] :


SEQ 

--  initialize buffers
SEQ i = [1 FOR maxwordblock.size]
SEQ
wbuffer0[BYTE i] := i
wbuffer1[BYTE i] := i
SKIP

SEQ i = [0 FOR nr.of.sizes]
IF
sizetable[i] < 4
SKIP  -- minimum number of bytes is 4 for integer transfer
TRUE
SEQ
block.size := sizetable[i]
--  input and output handling
--  input from four channels
SEQ j = [1 FOR repetition]
SEQ
out ! 'a'
WORD.SLICE.INPUT (in, wbuffer0, 1, (block.size/4))
SKIP

--  output to four channels
SEQ j = [1 FOR repetition]
SEQ
out ! 'a'
WORD.SLICE.OUTPUT (out, wbuffer0, 1, (block.size/4))
SKIP

--  all output and input in parallel
SEQ j = [1 FOR repetition]
SEQ
out ! 'a'
PAR
WORD.SLICE.INPUT (in, wbuffer0, 1, (block.size/4))
WORD.SLICE.OUTPUT (out, wbuffer1, 1, (block.size/4))
SKIP

SKIP
IF
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--  cpumode  not='0'  then  get  the  number  of  computations  done, 
cpumode  <>  '0' 
SEQ 

done  !  'a'

opnumber  ?  number 

TRUE 

SKIP   : 

--  main word.slice.transfer3
IF
((cpumode = '1') OR (cpumode = '2'))
PAR
wordtransfer3 (repetition, cpumode, flag, counter)
cpubusysum (flag, counter)
((cpumode = '3') OR (cpumode = '4'))
PRI PAR
wordtransfer3 (repetition, cpumode, flag, counter)
cpubusysum (flag, counter)
((cpumode = '5') OR (cpumode = '6'))
PAR
wordtransfer3 (repetition, cpumode, flag, counter)
cpubusyprod (flag, counter)
((cpumode = '7') OR (cpumode = '8'))
PRI PAR
wordtransfer3 (repetition, cpumode, flag, counter)
cpubusyprod (flag, counter)
TRUE
wordtransfer3 (repetition, cpumode, flag, counter) :

--  main transfer3.b003
SEQ
answer := 'Y'
WHILE answer = 'Y'
SEQ
in ? ch
in ? cpumode
in ? repetition
IF
ch = 'A'
inout.transfer3 (repetition, cpumode)
ch = 'B'
byte.slice.transfer3 (repetition, cpumode)
ch = 'I'
int.transfer3 (repetition, cpumode)
ch = 'W'
word.slice.transfer3 (repetition, cpumode)
TRUE
SKIP
in ? answer :
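
The IF blocks that close each transfer procedure all select, from the cpumode character, whether a busy procedure runs beside the transfer and whether the pair runs under PAR or PRI PAR. The selection can be summarized as a small lookup. This is a minimal Python sketch with a hypothetical function name (plan_for); it follows the inout, byte.slice and word.slice listings (the int.transfer3 listing calls cpubusysum in all four branches), and what distinguishes the two modes inside each pair is not visible in this listing.

```python
def plan_for(cpumode):
    """Map a cpumode character to (construct, busy procedure).

    Mirrors the IF branches of the listings; any other value means the
    transfer procedure runs alone, as in the final TRUE branch.
    """
    if cpumode in ("1", "2"):
        return ("PAR", "cpubusysum")
    if cpumode in ("3", "4"):
        return ("PRI PAR", "cpubusysum")
    if cpumode in ("5", "6"):
        return ("PAR", "cpubusyprod")
    if cpumode in ("7", "8"):
        return ("PRI PAR", "cpubusyprod")
    return (None, None)
```

For example, plan_for("3") yields the PRI PAR construct with the summing load, and plan_for("0") yields no concurrent load.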
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--  configuration 

--  Link Definitions
DEF link0in = 4
DEF link0out = 0
DEF link1in = 5
DEF link1out = 1
DEF link2in = 6
DEF link2out = 2
DEF link3in = 7
DEF link3out = 3

DEF root = 100:

CHAN pipein[4], pipeout[4] :

PLACED PAR

--  PROCESSOR ROOT
PROCESSOR root
PLACE pipein[0] AT link0in :
PLACE pipeout[0] AT link0out :
PLACE pipein[1] AT link1in :
PLACE pipeout[1] AT link1out :
PLACE pipein[2] AT link2in :
PLACE pipeout[2] AT link2out :
PLACE pipein[3] AT link3in :
PLACE pipeout[3] AT link3out :
hostproc (pipein[0], pipein[1], pipein[2], pipein[3],
pipeout[0], pipeout[1], pipeout[2], pipeout[3])

--  PROCESSOR 0
PROCESSOR 0
PLACE pipein[0] AT link0out :
PLACE pipeout[0] AT link0in :
transfer0.b003 (pipeout[0], pipein[0])

--  PROCESSOR 1
PROCESSOR 1
PLACE pipein[1] AT link0out :
PLACE pipeout[1] AT link0in :
transfer1.b003 (pipeout[1], pipein[1])

--  PROCESSOR 2
PROCESSOR 2
PLACE pipein[2] AT link0out :
PLACE pipeout[2] AT link0in :
transfer2.b003 (pipeout[2], pipein[2])

--  PROCESSOR 3
PROCESSOR 3
PLACE pipein[3] AT link0out :
PLACE pipeout[3] AT link0in :
transfer3.b003 (pipeout[3], pipein[3])
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APPENDIX  F 
PROGRAM  TEST  LINEARITY


--  header.occ
--  Title : Test Performance Linearity
--  Version : 2
--  Mod : 0
--  Author : Jose Vanni Filho, Lcdr., Brazilian Navy
--  Date : June 5th, 1987
--  Programming Language : OCCAM 1
--  Compiler : IMS D 600 - TDS
--  Brief Description : This program, mapped onto 17
--  transputers, shows the capability of the
--  transputer to increase performance linearly
--  with the number of processors.

--  Brief Description
--  This program runs in 17 transputers:
--  transputer Root - prompts the user and triggers the other
--                    transputers.
--                  - times the whole process execution
--                  - receives the results and sends them to the screen
--  transputers 00,10,20,30 - execute two processes in parallel:
--                  - routes the trigger and the results,
--                  - executes the procedure counter
--  transputers 01,02,03,11,12,13,21,22,23,31,32,33 (12)
--                  - execute the procedure counter only

--  PROGRAM testlinearity17
--  testlinearity
--  SC PROC hostproc
--  PROC hostproc
PROC hostproc (CHAN A, B, C, D, E, F, G, H) =

--  global definitions (partial)
--  Constants Definitions
DEF port = 0 :   -- assign the i/o port of the B001 to the terminal
DEF baud = 11 :  -- set the baud.rate to 9600 bps
DEF null = 0 :   -- constantly used ASCII values
DEF bell = 7 :
DEF tab = 9 :
DEF lf = 10 :
DEF cr = 13 :
DEF esc = 27 :
DEF sp = 32 :

--  Channels Definitions
CHAN Parameters AT 0 :
CHAN Screen :    -- AT 1:   These placements cannot be done in TDS. The
CHAN Keyboard :  -- AT 2:   terminal.driver already takes care of that.

--  Link Definitions
DEF link0out = 0
DEF link1out = 1
DEF link2out = 2
DEF link3out = 3
DEF link0in = 4
DEF link1in = 5
DEF link2in = 6
DEF link3in = 7

--  File Handler Control Values
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DEF  ClosedOK  =  -1 

DEF CloseFile = -2
DEF EndBuffer = -3
DEF EndFile = -4
DEF EndName = -5
DEF EndParameterString = -6
DEF EndRecord = -7
DEF NextRecord = -9
DEF OpenedOK = -10
DEF OpenForRead = -11
DEF OpenForWrite = -12

--  library.occ (partial)
--  io_routines.occ (partial)
--  Summary of i/o PROCs
--  PROC new.line      generates a CR and a LF
--  PROC write.string  outputs a string to the screen, byte by byte
--  PROC clear.screen  clears the screen and homes the cursor
--  PROC write.number  displays an integer value on the screen

--  PROC new.line
--  Jumps to a new line on the screen
PROC new.line =
SEQ
Screen ! cr; lf; EndBuffer :

--  PROC write.string (VALUE string[])
--  Writes a given string to the screen, in a byte by byte fashion
PROC write.string (VALUE string[]) =
SEQ
SEQ i = [1 FOR string[BYTE 0]]
Screen ! string[BYTE i]
Screen ! EndBuffer :

--  PROC clear.screen
--  Clears the screen.
PROC clear.screen =
SEQ
Screen ! esc; '['; '2'; 'J'; EndBuffer  -- clear screen sequence
Screen ! esc; '['; 'H' :  -- home cursor

--  PROC write.number (VALUE number)
--  This PROC outputs a signed integer value to the screen
PROC write.number (VALUE number) =
VAR output[16], count, x:
SEQ 

x := number
count := 0
IF
--  handle special cases
x = 0
Screen ! '0'
x < 0
SEQ
Screen ! '-'
x := -x
TRUE
SKIP
WHILE x > 0
--  construct number
SEQ
output[count] := (x\10) + '0'
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count  :=  count  +  1 
x:=  x/10 

WHILE  count  >  0
--  output number
SEQ 

count  :=  count-1 
Screen  !  output[count] 

SKIP: 
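
The digit handling in write.number (peel off decimal digits least significant first, then emit them in reverse) can be checked with a short sketch. This is a minimal Python illustration, not part of the thesis code: the name format_number is hypothetical, and the function returns the characters as one string instead of sending them to the Screen channel.

```python
def format_number(number):
    """Return the characters write.number would send, as one string."""
    x = number
    out = ""
    digits = []
    if x == 0:
        out += "0"                                  # special case: zero
    elif x < 0:
        out += "-"                                  # sign first, then magnitude
        x = -x
    while x > 0:
        digits.append(chr((x % 10) + ord("0")))     # least significant digit first
        x //= 10
    while digits:
        out += digits.pop()                         # emit in reverse order
    return out
```

With this sketch, format_number(-42) gives the three characters of "-42", matching the order in which the Occam procedure writes them.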
--  utilities.occ

--  PROC tick.to.time (VALUE start, stop, board.type)
--  Receives start and stop time and board type and outputs
--  the elapsed time in hours, minutes, seconds and milliseconds
PROC tick.to.time (VALUE start, stop, board.type) =
--  board.type = 0  --> VAX VMS
--  board.type = 1  --> B001
--  board.type = 2  --> B002
--  board.type = 31 --> B003 (high priority)
--  board.type = 32 --> B003 (low priority)
--  board.type = 4  --> B004

--  constant definitions
DEF vax.sec = 10000000 :   -- hundreds of nsec/second
DEF vax.mili = 10000 :     -- hundreds of nsec/millisecond
DEF b001.sec = 625000 :    -- # of 1.6 microsec/second
DEF b001.mili = 625 :      -- # of 1.6 microsec/millisecond
DEF b003h.sec = 1000000 :  -- # of microsec/second
DEF b003h.mili = 1000 :    -- # of microsec/millisecond
DEF b003l.sec = 15625 :    -- # of 64 microsec/second
DEF b003l.mili = 16 :      -- # of 64 microsec/millisecond
DEF max.number.of.ticks = 2147483648 :  -- maximum integer (2**31)

VAR elapsed.tick :
VAR factor1, factor2 :
VAR msec, tot.sec, sec, min, hr :
SEQ
IF
board.type = 0  -- VAX VMS
SEQ
factor1 := vax.sec
factor2 := vax.mili
board.type = 1  -- B001
SEQ
factor1 := b001.sec
factor2 := b001.mili
board.type = 2  -- B002
SKIP  -- will be implemented in the future
board.type = 31  -- B003 in high priority
SEQ
factor1 := b003h.sec
factor2 := b003h.mili
board.type = 32  -- B003 in low priority
SEQ
factor1 := b003l.sec
factor2 := b003l.mili
board.type = 4  -- B004
SKIP  -- will be implemented in the future
elapsed.tick := stop - start
IF
elapsed.tick < 0
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elapsed. tick  :=  elapsed. tick  +  max.number .of . ticks 

TRUE 
SKIP 


tot.sec := elapsed.tick/factor1
hr := tot.sec/3600
min := (tot.sec\3600)/60
sec := tot.sec\60
msec := (elapsed.tick\factor1)/factor2
--  output time to screen
write.number (hr)
write.string (" hr ")
write.number (min)
write.string (" min ")
write.number (sec)
write.string (" sec ")
write.number (msec)
write.string (" msec ") :
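
The arithmetic of tick.to.time can be traced with a small worked example. The following is a minimal Python sketch under stated assumptions: the names TICKS, MAX_TICKS and tick_to_time are hypothetical, the factors are copied from the constant definitions of the listing, and Python integers are unbounded, so the wrap-around correction only mimics the listing and does not model 32-bit overflow.

```python
# (ticks per second, ticks per millisecond) for each implemented board type
TICKS = {
    0: (10_000_000, 10_000),   # VAX VMS: 100 ns ticks
    1: (625_000, 625),         # B001: 1.6 microsecond ticks
    31: (1_000_000, 1_000),    # B003 high priority: 1 microsecond ticks
    32: (15_625, 16),          # B003 low priority: 64 microsecond ticks
}
MAX_TICKS = 2 ** 31


def tick_to_time(start, stop, board_type):
    """Return (hours, minutes, seconds, milliseconds) between two tick counts."""
    factor1, factor2 = TICKS[board_type]
    elapsed = stop - start
    if elapsed < 0:                      # the clock wrapped between readings
        elapsed += MAX_TICKS
    tot_sec = elapsed // factor1
    hr = tot_sec // 3600
    minutes = (tot_sec % 3600) // 60
    sec = tot_sec % 60
    msec = (elapsed % factor1) // factor2
    return hr, minutes, sec, msec
```

Note that the low priority millisecond factor 16 is a rounding of 15.625 ticks per millisecond, so the millisecond field for board type 32 is only approximate.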

--  PROC capitalize (VAR ch)
--  capitalizes any lower case character into upper case
PROC capitalize (VAR ch) =
DEF delta = ('a' - 'A') :  -- ASCII values: A --> 65, a --> 97, z --> 122
IF
(ch <= 'z') AND (ch >= 'a')
ch := ch - delta
TRUE
SKIP :

--  link placements
CHAN hostin0 AT link0in :
CHAN hostin1 AT link1in :
CHAN hostin2 AT link2in :
CHAN hostin3 AT link3in :
CHAN hostout0 AT link0out :
CHAN hostout1 AT link1out :
CHAN hostout2 AT link2out :
CHAN hostout3 AT link3out :

--  PROC  terminal  driver 

--  The terminal driver is the one provided by the manufacturer
--  for the B001 board and therefore is not included.




--  PROC user.interface
--  Receives a flag from the user and triggers the network.
--  Receives results from the network and outputs them to the screen.
PROC user.interface =
--  local constant and variable declaration
DEF tot = 16 :  -- number of transputers
VAR ch :  -- flag from the user
VAR result[tot] :
VAR startimeroot, endtimeroot :  -- timers for the root
VAR startime[tot], endtime[tot] :  -- timers for the 16 transputers

SEQ
write.string (" Type any character to start ")
Keyboard ? ch
Screen ! ch
new.line
TIME ? startimeroot
PAR
--  send flags
hostout0 ! ch
hostout1 ! ch
hostout2 ! ch
hostout3 ! ch
--  receive results
SEQ i = [0 FOR 4]
PAR
hostin0 ? result[i]; startime[i]; endtime[i]
hostin1 ? result[i+4]; startime[i+4]; endtime[i+4]
hostin2 ? result[i+8]; startime[i+8]; endtime[i+8]
hostin3 ? result[i+12]; startime[i+12]; endtime[i+12]
SKIP
TIME ? endtimeroot
--  send results to the screen
SEQ j = [0 FOR tot]
SEQ
write.string ("Transputer ")
write.number (j)
Screen ! sp; sp
write.number (result[j])
Screen ! sp; sp
tick.to.time (startime[j], endtime[j], 32)
new.line
SKIP
--  send total execution time to the screen
new.line
write.string (" Time to execute in parallel ")
write.string (" with 17 transputers => ")
tick.to.time (startimeroot, endtimeroot, 1) :

PAR
IMS.B001.terminal.driver (Keyboard, Screen, port, baud)
user.interface :

--  End of code for transputer Root.
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--  SC  PROC  Route 

--  PROC route (CHAN messagein, messageout, routeto1, ..., VALUE k)
PROC route (CHAN messagein, messageout, routeto1, routeto2, routeto3,
echofrom1, echofrom2, echofrom3, VALUE k) =
DEF i = 4 :  -- number of counter procedures
VAR msg :  -- flag
VAR results[i] :
VAR startime[i], endtime[i] :  -- timers
CHAN softin, softout :  -- soft channels declared for communication
--  with procedure counter.

--  SC  PROC  counter 

--  PROC  counter 

PROC  counter  (CHAN  in, out,  VALUE  tnumber)  = 

--  description
--  Sums up the first 100000 integers and adds the transputer number
--  to the total
DEF maxope = 100000:  -- number of operations done
VAR ch, total :
VAR startime3, endtime3 :
SEQ
total := tnumber
in ? ch
TIME ? startime3
SEQ i = [0 FOR maxope]
total := total + i
TIME ? endtime3
out ! total; startime3; endtime3 :

--  descriptor 

--  code 

SEQ 
PAR 

counter (softout, softin, k)
--  routing procedure
SEQ
messagein ? msg
SEQ
PAR
routeto1 ! msg
routeto2 ! msg
routeto3 ! msg
softout ! msg
PAR
echofrom1 ? results[0]; startime[0]; endtime[0]
echofrom2 ? results[1]; startime[1]; endtime[1]
echofrom3 ? results[2]; startime[2]; endtime[2]
softin ? results[3]; startime[3]; endtime[3]

--  sending to the root results and timing
SEQ i = [0 FOR 4]
messageout ! results[i]; startime[i]; endtime[i] :

--  End of code for transputers Routers (00, 10, 20, 30)
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--  SC  PROC  counter 

--  PROC  counter  (CHAN  in, out,  VALUE  tnumber) 
PROC  counter  (CHAN  in, out,  VALUE  tnumber)  = 
--  description 

--  Sums up the first 100000 integers and adds the transputer number
--  to the total. Sends the result through channel out.

DEF  maxope  =  100000: 

VAR  ch, total  : 

VAR  startime,  endtime : 

SEQ 

total  :=  tnumber 

in  ?  ch 

TIME  ?  startime 

SEQ  i  =  [0  FOR  maxope] 

total  :=  total  +  i 
TIME  ?  endtime 
out  !  total; startime; endtime : 

--  End of code for transputers Leaves (01,02,03,11,12,13,21,22,...)
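
The value each counter process should report can be worked out by hand: the replicator SEQ i = [0 FOR maxope] adds the integers 0 through 99999, whose sum is 99999 x 100000 / 2 = 4,999,950,000, and the transputer number is added to that. The following is a minimal Python sketch of the same loop (the names MAXOPE, INT32_MAX and counter are hypothetical and the channels and timers are left out). The exact sum is larger than the largest signed 32-bit integer; the listing does not say how this was handled on the 32-bit hardware, and Python's unbounded integers do not reproduce any wrap-around.

```python
MAXOPE = 100_000          # number of additions, as in DEF maxope
INT32_MAX = 2 ** 31 - 1   # largest signed 32-bit value, for comparison


def counter(tnumber):
    """Return tnumber plus the sum of the integers 0 .. MAXOPE-1."""
    total = tnumber
    for i in range(MAXOPE):   # i = 0 .. MAXOPE-1, as in SEQ i = [0 FOR maxope]
        total += i
    return total
```

Because every processor adds its own number to the same sum, the results identify which transputer produced each line of output.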
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--  configuration 

--  link definitions
DEF link0in = 4
DEF link0out = 0
DEF link1in = 5
DEF link1out = 1
DEF link2in = 6
DEF link2out = 2
DEF link3in = 7
DEF link3out = 3

DEF root = 100:
DEF totlinks = 32
CHAN pipe [totlinks] :

PLACED PAR

PROCESSOR root
--  link placements and process assignment
PLACE pipe[0] AT link0in :
PLACE pipe[1] AT link0out :
PLACE pipe[2] AT link1in :
PLACE pipe[3] AT link1out :
PLACE pipe[4] AT link2in :
PLACE pipe[5] AT link2out :
PLACE pipe[6] AT link3in :
PLACE pipe[7] AT link3out :
hostproc (pipe[0], pipe[2], pipe[4], pipe[6],
pipe[1], pipe[3], pipe[5], pipe[7])

PLACED PAR j = [0 FOR 4]
PROCESSOR 10*j
--  link placements and process assignment
PLACE pipe[2*j] AT link0out :
PLACE pipe[(2*j)+1] AT link0in :
PLACE pipe[8+(6*j)] AT link2in :
PLACE pipe[9+(6*j)] AT link2out :
PLACE pipe[10+(6*j)] AT link1in :
PLACE pipe[11+(6*j)] AT link1out :
PLACE pipe[12+(6*j)] AT link3in :
PLACE pipe[13+(6*j)] AT link3out :
route (pipe[(2*j)+1], pipe[2*j], pipe[9+(6*j)], pipe[11+(6*j)],
pipe[13+(6*j)], pipe[8+(6*j)], pipe[10+(6*j)], pipe[12+(6*j)], 10*j)

PLACED PAR i = [0 FOR 4]
PROCESSOR (10*i)+1
--  link placements and process assignment
PLACE pipe[8+(6*i)] AT link3out :
PLACE pipe[9+(6*i)] AT link3in :
counter (pipe[9+(6*i)], pipe[8+(6*i)], ((10*i)+1))

PLACED PAR i = [0 FOR 4]
PROCESSOR (10*i)+2
--  link placements and process assignment
PLACE pipe[10+(6*i)] AT link1out :
PLACE pipe[11+(6*i)] AT link1in :
counter (pipe[11+(6*i)], pipe[10+(6*i)], ((10*i)+2))

PLACED PAR i = [0 FOR 4]
PROCESSOR (10*i)+3
--  link placements and process assignment
PLACE pipe[12+(6*i)] AT link2out :
PLACE pipe[13+(6*i)] AT link2in :
counter (pipe[13+(6*i)], pipe[12+(6*i)], ((10*i)+3))
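
The index arithmetic of the configuration (pipe[2*j] and pipe[(2*j)+1] between the root and router 10*j, pipe[8+(6*j)] through pipe[13+(6*j)] between a router and its three leaves) can be enumerated to confirm that exactly 32 channels are used and that each one has two ends. This is a minimal Python sketch, assuming the placements as reconstructed from the listing; the name build_links and the dictionary layout are hypothetical.

```python
def build_links():
    """Return {pipe index: [(processor, link name), ...]} for the 17-node net."""
    ends = {}

    def place(pipe, processor, link):
        ends.setdefault(pipe, []).append((processor, link))

    # root: pipe[0..7] on link0in, link0out, link1in, ... link3out
    root_links = ["link0in", "link0out", "link1in", "link1out",
                  "link2in", "link2out", "link3in", "link3out"]
    for index, link in enumerate(root_links):
        place(index, "root", link)

    for j in range(4):
        router = 10 * j
        # router side
        place(2 * j, router, "link0out")
        place(2 * j + 1, router, "link0in")
        place(8 + 6 * j, router, "link2in")
        place(9 + 6 * j, router, "link2out")
        place(10 + 6 * j, router, "link1in")
        place(11 + 6 * j, router, "link1out")
        place(12 + 6 * j, router, "link3in")
        place(13 + 6 * j, router, "link3out")
        # leaf side
        place(8 + 6 * j, router + 1, "link3out")
        place(9 + 6 * j, router + 1, "link3in")
        place(10 + 6 * j, router + 2, "link1out")
        place(11 + 6 * j, router + 2, "link1in")
        place(12 + 6 * j, router + 3, "link2out")
        place(13 + 6 * j, router + 3, "link2in")
    return ends
```

Each entry of the result pairs one output end with one input end, which is the property the PLACE statements must satisfy for the links to connect.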
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APPENDIX  G 
TRANSPUTER  PRODUCTS* 

a.  Transputers
IMS  T414B-G15S  -  32  bit  transputer  -  15  MHz
IMS  T414B-G20S  -  32  bit  transputer  -  20  MHz
IMS  T800B-G20S  -  32  bit  floating  point  transputer  -  20  MHz
IMS  T212A-G17S  -  16  bit  transputer  -  17  MHz
IMS  T212A-G20S  -  16  bit  transputer  -  20  MHz
IMS  M212B-G15S  -  Winchester  and  Floppy  disk  controller

b.  Evaluation  Boards
IMS  B002-2  -  T414  with  2  MBytes  DRAM  and  2  x  RS232
IMS  B003-1  -  Described  in  Chapter  I
IMS  B003-2  -  4  x  T414  -  20  MHz  each  with  256  KBytes  DRAM
IMS  B004-4  -  Described  in  Chapter  I
IMS  B005-1  -  M212  with  64  KBytes  SRAM,  20  MB  WINI,  640K  Floppy
IMS  B006-1  -  T212  with  64  KBytes  SRAM,  and  2  x  RS232
IMS  B006-2  -  T212  with  64  KBytes  SRAM,  and  8  x  T212  (8K  SRAM)
IMS  B007-1  -  Graphics  Evaluation  Board  with  1  T414,  512K  DRAM

c.  Digital  Signal  Processing
IMS  A100-G20S  -  32  stage  cascadable  signal  processor

*  All  trademarks  on  this  page  are  registered  trademarks  of  the
INMOS  Group  of  Companies,  Bristol,  UK.




LIST  OF  REFERENCES 


1.  Garret,  D.  R.,  A  Software  System  Implementation  Guide  and  System  Prototyping 
Facility  for  the  M CORTEX  Executive  on  the  Real  Time  Cluster,  M.  S.  Thesis, 
Naval  Postgraduate  School,  Monterey,  California,  December  1986. 

2.  Weitzman,  C.  Distributed  Micro; Mini-computer  Systems,  Prentice-Hall,  New 
Jersey,  1980. 

3.  Peterson,  J.  &  Silberchatz,  A.,  Operating  Systems  Concepts,  Second  Edition, 
Addison-Wesley  PubUshing  Co.,  Reading,  Massachusetts,  1986. 

4.  Evin,  B.  ,  Implementation  of  a  Serial  Delay  Insertion  Type  Loop  Communication 
for   a   Real   Time   Multitransputer   System,    M.    S.    Thesis,    Naval   Postgraduate 

School,  Monterey,  California,  June  1985. 

5.  Selcuk,  Z.,  Implementation  of  a  Serial  Communication  Process  for  a  Fault 
Tolerant,  Real  Time,  Multitransputer  Operating  System  M.  S.  Thesis,  Naval 
Postgraduate  School,  Monterey,  California,  December  1984. 

6.  Cordeiro,  M.  M.,  Design,  Implementation  and  Evaluation  of  an  Operating  System 
for  a  Transputer  Network,  M.  S.  Thesis,  Naval  Postgraduate  School,  Monterey, 
California,  June  1987. 

7.  INMOS  Limited,  Transputer  Reference  Manual,  October  1986. 

8.  Miller,  Neil  Exploring  Multiple  Transputer  Arrays,  INMOS  Technical  note  24, 
January  1987. 

9.  Kodres,  U.  R.,  "Processing  Efficiency  of  a  Class  of  Multi-computer  Systems",
International  Journal  of  Mini  and  Micro-computers,  Volume  5,  No.  2,  pp  28-33,
1983.

10.  Wilson,  P.,  "Occam  Architecture  Eases  System  Design  -  Part  1",  Computer 
Design,  Volume  22,  No.  13,  pp  107-110,  November  1983. 

11.  Wilson,  P.,  "Occam  Architecture  Eases  System  Design  -  Part  2",  Computer
Design,  Volume  22,  No.  14,  pp  109-114,  December  1983. 

12.  Pountain,  D.,  A  Tutorial  Introduction  to  Occam  Programming,  1985.




13.  INMOS  Limited,  Occam  Programming  System,  1985. 

14.  INMOS  Limited,  IMS  D600  Transputer  Development  System,  1985.

15.  INMOS  Limited,  IMS  D701  Transputer  Development  System,  1985.

16.  INMOS  Limited,  IMS  B001  Evaluation  Board  User  Manual,  1985.

17.  INMOS  Limited,  IMS  B003  Evaluation  Board  User  Manual,  1985. 

18.  INMOS  Limited,  IMS  B004  Evaluation  Board  User  Manual,  1985. 

19.  Halsall,  F.,  Introduction  to  Data  Communications  and  Computer  Networks,
Addison-Wesley,  Wokingham,  United  Kingdom,  1985.

20.  Cellary,  W.  and  Stroinski,  M.,  "Analysis  of  Methods  of  Computer  Network
Performance  Measurement",  Performance  of  Computer  Communication  Systems,
Werner  Bux  and  Harry  Rudin,  Editors,  North-Holland,  1984.

21.  INMOS  Limited,  IMS  T800  Architecture,  INMOS  Technical  note  6,  Bristol,
United  Kingdom,  1986.

22.  Naval  Postgraduate  School,  Computer  Science  Department,  VAX/VMS
Introduction,  by  Bruce  R.  Montague,  January  1983,  revised  June  1986.




BIBLIOGRAPHY

INMOS  Corporation,  Compiler  Writers  Guide,  Draft,  1986. 

INMOS  Corporation,  Transputer  America,  1986.

INMOS  Limited,  Product  Information  -  The  Transputer  Family,  June  1986.

MacLennan,  B.  J.,  Principles  of  Programming  Languages:  Design,  Evaluation  and
Implementation,  CBS  College  Publishing,  New  York,  1983.

Stallings,  W.,  Computer  Organization  and  Architecture,  Macmillan  Publishing  Company,
New  York,  1987.

Mattos,  P.,  The  Transputer  Based  Navigation  System  -  An  Example  of  Testing
Embedded  Systems,  INMOS  Technical  note  2,  November  1986.

Mattos,  P.,  Program  Design  for  Concurrent  Systems,  INMOS  Technical  note  5,
December  1986.




INITIAL  DISTRIBUTION  LIST 

No.   Copies 

1.  Defense  Technical  Information  Center        2
Cameron  Station
Alexandria,  VA  22304-6145

2.  Library,  Code  0142        2
Naval  Postgraduate  School
Monterey,  CA  93943-5002

3.  Department  Chairman,  Code  52        1
Department  of  Computer  Science
Naval  Postgraduate  School
Monterey,  CA  93943

4.  Dr.  Uno  R.  Kodres,  Code  52Kr        3
Department  of  Computer  Science
Naval  Postgraduate  School
Monterey,  CA  93943

5.  Dr.  Daniel  L.  Davis,  Code  52Dv        1
Department  of  Computer  Science
Naval  Postgraduate  School
Monterey,  CA  93943

6.  Daniel  Green,  Code  20F        1
Naval  Surface  Weapons  Center
Dahlgren,  VA  22449

7.  Jerry  Gaston,  Code  N24        1
Naval  Surface  Weapons  Center
Dahlgren,  VA  22449

8.  CAPT.  J.  Hood,  USN        1
PMS  400B5
Naval  Sea  Systems  Command
Washington,  D.C.  20362

9.  RCA  AEGIS  Repository        1
RCA  Corporation
Government  Systems  Division
Mail  Stop  127-327
Moorestown,  NJ  08057

10.  Library  (Code  E33-05)        1
Naval  Surface  Weapons  Center
Dahlgren,  VA  22449




11.  Dr.  M.  J.  Gralia
Applied  Physics  Laboratory
Johns  Hopkins  Road
Laurel,  MD  20702

12.  Dana  Small,  Code  8242
Naval  Ocean  Systems  Center
San  Diego,  CA  92152

13.  Estado  Maior  da  Armada
Brazilian  Naval  Commission
4706  Wisconsin  Ave.,  N.W.
Washington,  DC  20016

14.  Diretoria  de  Ensino  da  Marinha
Brazilian  Naval  Commission
4706  Wisconsin  Ave.,  N.W.
Washington,  DC  20016

15.  Diretoria  de  Armamento  e  Comunicacoes  da  Marinha
Brazilian  Naval  Commission
4706  Wisconsin  Ave.,  N.W.
Washington,  DC  20016

16.  Instituto  de  Pesquisas  da  Marinha
Brazilian  Naval  Commission
4706  Wisconsin  Ave.,  N.W.
Washington,  DC  20016

17.  Instituto  Militar  de  Engenharia
Praia  Vermelha,  Urca
Rio  de  Janeiro,  RJ
CEP  20000,  BRAZIL

18.  Instituto  Tecnologico  da  Aeronautica
Sao  Jose  dos  Campos,  SP
CEP  11000,  BRAZIL

19.  Pontificia  Universidade  Catolica
R.  Marques  de  Sao  Vicente  225,  Gavea
Rio  de  Janeiro,  RJ
CEP  20000,  BRAZIL

20.  Pete  Wilson
INMOS  CORPORATION
P.O.  Box  16000
Colorado  Springs,  CO  80935-16000

21.  David  May
INMOS  LTD.
1000  Aztec  West
Almondsbury,  Bristol,  BS12  4SQ,  UK




22.  MAJ/USAF  R.  A.  Adams,  Code  52Ad
Department  of  Computer  Science
Naval  Postgraduate  School
Monterey,  CA  93943

23.  LCDR.  J.  Vanni  Filho,  Br.  Navy
Brazilian  Naval  Commission  (DACM)
4706  Wisconsin  Ave.,  N.W.
Washington,  DC  20016

24.  LCDR.  Gilberto  F.  Mota,  Br.  Navy
Brazilian  Naval  Commission  (DACM)
4706  Wisconsin  Ave.,  N.W.
Washington,  DC  20016

25.  LT.  M.  M.  Cordeiro,  Br.  Navy
Brazilian  Naval  Commission  (DACM)
4706  Wisconsin  Ave.,  N.W.
Washington,  DC  20016






